ADVANCES IN ACCOUNTING BEHAVIORAL RESEARCH
ADVANCES IN ACCOUNTING BEHAVIORAL RESEARCH Series Editor: Vicky Arnold Volumes 1–4:
Series Editor: James E. Hunton
Volumes 5–9:
Series Editor: Vicky Arnold
ADVANCES IN ACCOUNTING BEHAVIORAL RESEARCH
VOLUME 10
ADVANCES IN ACCOUNTING BEHAVIORAL RESEARCH EDITED BY VICKY ARNOLD Ernst & Young Professor of Accounting, Dixon School of Accounting, University of Central Florida, USA and Principal Fellow, Department of Accounting and Business Information Systems, The University of Melbourne, Australia
ASSOCIATE EDITORS:
B. DOUGLAS CLINTON Northern Illinois University, USA
ANNE LILLIS University of Melbourne, Australia
ROBIN ROBERTS University of Central Florida, USA
CHRIS WOLFE Texas A&M University, USA
SALLY WRIGHT University of Massachusetts Boston, USA
Amsterdam – Boston – Heidelberg – London – New York – Oxford Paris – San Diego – San Francisco – Singapore – Sydney – Tokyo JAI Press is an imprint of Elsevier
JAI Press is an imprint of Elsevier Linacre House, Jordan Hill, Oxford OX2 8DP, UK Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands 525 B Street, Suite 1900, San Diego, CA 92101-4495, USA First edition 2007 Copyright r 2007 Elsevier Ltd. All rights reserved No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email:
[email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://www.elsevier.com/locate/permissions, and selecting Obtaining permission to use Elsevier material Notice No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library ISBN: 978-0-7623-1454-6 ISSN: 1475-1488 (Series)
For information on all JAI Press publications visit our website at books.elsevier.com Printed and bound in the United Kingdom 07 08 09 10 11 10 9 8 7 6 5 4 3 2 1
CONTENTS LIST OF CONTRIBUTORS
vii
REVIEWER ACKNOWLEDGMENTS
ix
EDITORIAL POLICY AND SUBMISSION GUIDELINES
xi
FACTORS POSITED TO INCREASE DEMAND FOR CONTINUOUS ASSURANCE Ronald J. Daigle and James C. Lampe USING ELECTRONIC AUDIT WORKPAPER SYSTEMS IN AUDIT PRACTICE: TASK ANALYSIS, LEARNING, AND RESISTANCE Jean C. Bedard, Michael L. Ettredge and Karla M. Johnstone LIMITED ATTENTION AND INDIVIDUALS’ INVESTMENT DECISIONS: EXPERIMENTAL EVIDENCE Bryan K. Church and Kirsten Ely UNETHICAL FINANCIAL DECISION-MAKING: PERSONAL GAIN VS. CONCERN FOR OTHERS Frank Collins, Oscar J. Holzmann, Suzanne Lowensohn and Michael K. Shaub EFFECTS OF INFORMATION LOAD AND COGNITIVE STYLE ON INFORMATION SEARCH STRATEGIES Charles F. Kelliher and Lois S. Mahoney v
1
29
55
77
101
vi
CONTENTS
AN ASSESSMENT OF THE CONTRIBUTION OF STRESS AROUSAL TO THE BEYOND THE ROLE STRESS MODEL Kenneth J. Smith, Jeanette A. Davy and George S. Everly, Jr.
127
BUDGETARY FAIRNESS, SUPERVISORY TRUST, AND THE PROPENSITY TO CREATE BUDGETARY SLACK: TESTING A SOCIAL EXCHANGE MODEL IN A GOVERNMENT BUDGETING CONTEXT A. Blair Staley and Nace R. Magner
159
THE QUESTION OF DISCLOSURE: PROVIDING A TOOL FOR EVALUATING MANAGEMENTS’ DISCUSSION AND ANALYSIS Lori Holder-Webb
183
LIST OF CONTRIBUTORS Jean C. Bedard
Department of Accountancy, Bentley College, Waltham, MA, USA
Bryan K. Church
College of Management, Georgia Institute of Technology, Atlanta, GA, USA
Frank Collins
College of Business Administration, Texas A&M International University, Laredo, TX, USA
Ronald J. Daigle
Department of Accounting, Sam Houston State University, Huntsville, TX, USA
Jeanette A. Davy
Department of Management, Wright State University, Dayton, OH, USA
Kirsten Ely
School of Business and Economics, Sonoma State University, Rohnert Park, CA, USA
Michael L. Ettredge
School of Business, University of Kansas, Lawrence, KS, USA
George S. Everly, Jr.
Johns Hopkins Center for Public Health Preparedness, Johns Hopkins School of Medicine, USA
Lori Holder-Webb
Department of Accounting and Information Systems, University of Wisconsin-Madison, Madison, WI, USA
vii
viii
LIST OF CONTRIBUTORS
Oscar J. Holzmann
School of Business Administration, University of Miami, Coral Gables, FL, USA
Karla M. Johnstone
Department of Accounting and Information Systems, University of Wisconsin-Madison, Madison, WI, USA
Charles F. Kelliher
Kenneth G. Dixon School of Accounting, University of Central Florida, Orlando, FL, USA
James C. Lampe
School of Accountancy, Missouri State University, Springfield, MO, USA
Suzanne Lowensohn
College of Business, Colorado State University, Fort Collins, CO, USA
Nace R. Magner
Department of Accounting, Western Kentucky University, Bowling Green, KY, USA
Lois S. Mahoney
Department of Accounting and Finance, East Michigan University, Ypsilanti, MI, USA
Michael K. Shaub
Department of Accounting, Texas A&M University, College Station, TX, USA
Kenneth J. Smith
Department of Accounting and Legal Studies, Salisbury University, Salisbury, MD, USA
A. Blair Staley
Department of Accounting, Bloomsburg University of Pennsylvania, Bloomsburg, PA, USA
REVIEWER ACKNOWLEDGMENTS The Editor and Associate Editors at AABR would like to thank the many excellent reviewers who have volunteered their time and expertise to make this an outstanding publication. Publishing quality papers in a timely manner would not be possible without their efforts.
Mohammed Abdolmohammadi Bentley College, USA
Marshall Geiger University of Richmond, USA
Elizabeth Dreike Almer Portland State University, USA
Amy Hageman University of Central Florida, USA
Darrell Brown Portland State University, USA
Julia Higgs Florida Atlantic University, USA Karen L. Hooks Florida Atlantic University, USA
Janne Chung York University, Canada
Susan Hughes Butler University, USA
Ronny Daigle Sam Houston State University, USA
Kathy Hurtt Baylor University, USA
Stan Davis University of TennesseeChattanooga, USA
Lori Kopp University of Lethbridge, Canada
Michael Favere-Marchesi Simon Fraser University, Canada
Ganesh Krishnamoorthy Northeastern University, USA
Leonor Ferreira Universidade Nova de Lisboa, Portugal
J. Randel Kuhn University of Central Florida, USA
Dann Fisher Kansas State University, USA
Tina Loraas Auburn University, USA
Timothy J. Fogarty Case Western Reserve University, USA
Jordan Lowe Arizona State University, USA ix
x
REVIEWER ACKNOWLEDGMENTS
Adam S. Maiga University of Wisconsin-Milwaukee, USA
Steve Salter University of Cincinnati, USA
James Maroney Northeastern University, USA Linda Matuszewski Northern Illinois University, USA Karen Nunez North Carolina State University, USA
DeWayne Searcy Auburn University, USA Doug Stevens Florida State University, USA Scott Summers Brigham Young University, USA Steve Sutton University of Central Florida, USA
Ed O’Donnell University of Kansas, USA
Mark Taylor Creighton University, USA
Troy Paredes Washington University, St. Louis, USA
Jay Thibodeau Bentley College, USA
Mike Power London School of Economics, England
Kristin Wentzel La Salle University, USA Patrick Wheeler University of Missouri, USA
Alan Reinstein Wayne State University, USA
Brett Wilkinson Baylor University, USA
Morina Rennie University of Regina, Canada
Bernard Wong-On-Wing Washington State University, USA
Randall Rentfro Florida Atlantic University, USA
Arnold Wright Boston College, USA
Jay Rich Illinois State University, USA
Jia Wu University of Massachusetts, Dartmouth, USA
EDITORIAL POLICY AND SUBMISSION GUIDELINES Advances in Accounting Behavioral Research (AABR) publishes articles encompassing all areas of accounting that incorporate theory from and contribute new knowledge and understanding to the fields of applied psychology, sociology, management science, and economics. The journal is primarily devoted to original empirical investigations; however, literature review papers, theoretical analyses, and methodological contributions are welcome. AABR is receptive to replication studies, provided they investigate important issues and are concisely written, and is receptive to methodological examinations that can potentially inform future behavioral research. The journal especially welcomes manuscripts that integrate accounting issues with organizational behavior, human judgment/decision making, and cognitive psychology. Manuscripts will be blind-reviewed by two reviewers and an associate editor. The recommendations of the reviewers and associate editor will be used to determine whether to accept the paper as is, accept the paper with minor revisions, reject the paper, or to invite the authors to revise and resubmit the paper.
MANUSCRIPT SUBMISSION Manuscripts should be forwarded to the editor, Vicky Arnold, at
[email protected] via e-mail. All text, tables, and figures should be incorporated into a Word document prior to submission. The manuscript should also include a title page containing the name and address of all authors and a concise abstract. Also, include a separate Word document with any experimental materials or survey instruments. If you are unable
xi
xii
EDITORIAL POLICY AND SUBMISSION GUIDELINES
to submit electronically, please forward the manuscript along with the experimental materials to the following address: Vicky Arnold, Editor Advances in Accounting Behavioral Research Kenneth G. Dixon School of Accounting University of Central Florida P. O. Box 161400 Orlando, FL 32816-1400 References should follow the APA (American Psychological Association) standard. References should be indicated by giving (in parentheses) the author’s name followed by the date of the journal or book; or with the date in parentheses, as in ‘‘suggested by Canada (2005).’’ In the text, use the form Hageman et al. (2006) where there are more than two authors, but list all authors in the references. Quotations of more than one line of text from cited works should be indented and citation should include the page number of the quotation; e.g. (Phillips, 2001, p. 56). Citations for all articles referenced in the text of the manuscript should be shown in alphabetical order in the Reference list at the end of the manuscript. Only articles referenced in the text should be included in the Reference list. Format for references is as follows: For journals Dunn, C. L., & Gerard, G. J. (2001). Auditor efficiency and effectiveness with diagrammatic and linguistic conceptual model representations. International Journal of Accounting Information Systems, 2(3), 1–40. For books Ashton, R. H., & Ashton, A. H. (1995). Judgment and decision-making research in accounting and auditing. New York, NY: Cambridge University Press. For a thesis Smedley, G. A. (2001). The effects of optimization on cognitive skill acquisition from intelligent decision aids. Unpublished doctoral dissertation, University. For a working paper Thorne, L., Massey, D. W., & Magnan, M. (2000). Insights into selectionsocialization in the audit profession: An examination of the moral reasoning of public accountants in the United States and Canada. Working paper: York University, North York, Ontario.
Editorial Policy and Submission Guidelines
xiii
For papers from conference proceedings, chapters from book etc. Messier, W. F. (1995). Research in and development of audit decision aids. In: R. H. Ashton & A. H. Ashton (Eds), Judgment and Decision Making in Accounting and Auditing (pp. 207–230). New York: Cambridge University Press.
This page intentionally left blank
FACTORS POSITED TO INCREASE DEMAND FOR CONTINUOUS ASSURANCE Ronald J. Daigle and James C. Lampe ABSTRACT This study examines factors that may increase the demand for continuous assurance (CA) by individuals making repetitive choice decisions. Business graduate students act as surrogates for financial analysts to study CA demand. Participants choose repeatedly over a multi-period timeframe which of the two companies is the better investment over the entire timeframe. Participants may purchase assured information each period before making their choice for that period. Four experiments manipulate the similarity/dissimilarity of alternatives and the number of reporting periods in an operating cycle for determining their impact on CA demand. Increasing the similarity of alternatives increases CA demand because of increased decision-making complexity, but only when increasing the number of reporting periods. Increasing the number of reporting periods decreases CA demand because of increasing knowledge accumulation over time. CA demand persistence over time is low and erratic in all experiments. While CA demand increases when choices are more similar, overall demand is still relatively low and very inconsistent. Implications of results on the viability of external CA services are discussed.
Advances in Accounting Behavioral Research, Volume 10, 1–28 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10001-6
1
2
RONALD J. DAIGLE AND JAMES C. LAMPE
INTRODUCTION Provision of continuous assurance (CA) services by independent accountants or competitor non-accountants has received considerable attention over the past several years. CA has been discussed as a source of increased revenues for CPAs, a service needing technical development, and a topic in need of objective research. This paper reports the methodology and results of laboratory experiments performed for determining factors that may increase the demand for CA reports by traditional external financial statement users. The AICPA and CICA jointly define CA as (AICPA/CICA, 1999, p. 5): y written assurance on a subject matter, for which an entity’s management is responsible, using a series of auditor reports issued virtually simultaneously with, or a short period of time after, the occurrence of events underlying the subject matter.
As part of its findings, the report recognizes that external CA services (AICPA/CICA, 1999, p. 70): y will never catch on unless decision-makers perceive value in having added assurance on continuous information used in making key decisions regarding an entity.
The report concludes that demand studies are needed to ensure that external CA services are viable. The lack of understanding of CA demand has been cited often as a primary factor hindering CA development, with repeated calls for objective studies on the topic (Vasarhelyi, Alles, & Kogan, 2004; Hunton, Wright, & Wright, 2004; Alles, Kogan, & Vasarhelyi, 2002; Arnold & Sutton, 2001; Kogan, Sudit, & Vasarhelyi, 1999). Demand for CA is studied in an experimental environment of repetitive choice decision-making of investment advising using business graduate students acting as financial analysts. This role is chosen because financial analysts and investors are identified by the AICPA/CICA as likely recipients and users of CA. Prior research, however, has indicated that internal valuation decisions are the type of decision that are most aided by CA and that user demand by external investment decision-makers is low (Daigle & Lampe, 2004; Vasarhelyi, 2002). The primary research question addressed in this study is whether factors, not previously researched, exist and increase the demand for CA by external decision-makers such as financial analysts. A laboratory experiment methodology is applied to determine if greater similarity (complexity) of the information presented about alternative investments or the number of CA reports issued about alternative companies within an operating cycle
Factors Posited to Increase Demand for Continuous Assurance
3
impacts CA demand. Consumer behavior research indicates that as product choice alternatives (such as investment options) become more similar, decision complexity increases, and demand for information on the alternatives also increases (e.g., John, Scott, & Bettman, 1986). Consumer behavior research also indicates that as the number of repeated decision periods increases, the demand for further information about choice alternatives declines (e.g., Simonson, Huber, & Payne, 1988). Experimental participants act as external financial analysts evaluating which of the two organizations is the better investment over a given timeframe (an operating cycle) of repeated decision-making. Participants may purchase assured information before making their decision each period, but are better compensated if they make correct choices without purchasing CA. Two organizations are compared in four experimental settings. The operating results of both organizations are similar to each other in two experiments but dissimilar in two other experiments. Within the same four experiments, two experiments have fewer CA reporting periods than the other two, all over the same given operating cycle. Results indicate that CA demand increases when decision-makers choose between similar alternatives vs. dissimilar alternatives, but only when the number of reporting periods is increased. While a promising result, it is questionable, however, if the amount of increase has a practical impact on the viability of CPAs offering external CA services. Results also indicate that the CA purchase frequency rate decreases substantially when the number of reporting periods is increased. As a further observation, the persistency of purchasing assured operating information over the extended timeframe of repetitive choice decision-making is low and erratic in all four experiments. The overall conclusion of this study is that external choice decision-makers (investors/financial analysts) have low and erratic demand for CA reports on operating results of alternative companies under consideration for investment. The remainder of this paper is as follows: the next section provides background and hypothesis development. Two subsequent sections discuss methodology and data collection/analysis. The final section discusses implications of results on future CA service development.
BACKGROUND AND HYPOTHESIS DEVELOPMENT Although CA has received increased attention since the issuance of the AICPA/CICA research report, CA viability has been formally discussed for
4
RONALD J. DAIGLE AND JAMES C. LAMPE
at least 25 years (Kearns, 1980). Most CA research has focused on its technical aspects (e.g., Alles, Kogan, & Vasarhelyi, 2004; Vasarhelyi et al., 2004; Murthy & Groomer, 2004; Alles, Kogan, & Vasarhelyi, 2003; Noteberg, Benford, & Hunton, 2003; Woodroof & Searcy, 2001; Vasarhelyi, 1998; Halper, Snively, & Vasarhelyi, 1992; Vasarhelyi, Halper, & Ezawa, 1991; Vasarhelyi & Halper, 1991; Groomer & Murthy, 1989; Koch, 1981). Studies of CA demand behavior have been called for (Vasarhelyi et al., 2004; Hunton et al., 2004; Alles et al., 2002, Arnold & Sutton, 2001; AICPA/CICA, 1999; Kogan et al., 1999), but little research to date has attempted to identify factors and their possible influence on CA demand by decision-makers.
Prior Research on CA Demand Daigle and Lampe (2004) investigate the impact of making repetitive valuation vs. choice decisions on CA demand. The valuation vs. choice dichotomy is a common one in decision-making (e.g., Smith, Arnold, & Sutton, 1997; Selart, 1996; Ganzach, 1995; Simonson & Tversky, 1992; Yates, 1990; Billings & Scherer, 1988; Payne, Bettman, & Johnson, 1988; Rothstein, 1986; Hogarth, 1981). Valuation decisions involve estimating specific values and comparing them to a decision standard, while choice decisions involve choosing between alternatives or options as to which is better/worse, greater than/less than, etc. Valuation decisions are discrete and absolute while choice decisions are relative in nature. The primary finding of the prior research is that, at the most basic level, CA demand is significantly greater when decision-makers are faced with repetitive valuation decisions relative to when faced with repetitive choice decisions. Another finding is that increasing the risk of consequence via a decrease in the price for CA increases CA demand when making either type of decision. However, the impact on demand is much greater when making repetitive valuation decisions, and only marginal when making repetitive choice decisions. The study concludes that there is a greater CA demand internal to organizations relative to externally. This finding is consistent with the previous discussion regarding where the initial demand for CA services may begin (Vasarhelyi, 2002). The research findings are inconsistent with the profession’s identification of traditional financial statement users (i.e., external investors, analysts, lenders, etc.) as the primary group for whom CA services should initially be developed. Big 4 audit partners believe that while companies are currently
Factors Posited to Increase Demand for Continuous Assurance
5
resistant to reporting greater amounts of timely and audited information, companies will eventually do so for improving their credibility in the marketplace (Behn, Searcy, & Woodroof, 2006; Searcy, Woodroof, & Behn, 2003). This study, therefore, investigates situations in which CA demand by external investment decision-makers such as financial analysts and investors might be increased. There is no published research on factors that have been posited to increase CA demand by analysts and investors. There has, however, been substantial consumer behavior research on choice decision situations (Dhar, Nowlis, & Sherman, 2000). Because an analyst’s or investor’s choice between investment alternatives is a choice between competing products, results from consumer behavior studies using experimental methodologies provide insights into contextual factors that may influence CA demand by users making repetitive choice decisions.
Choice Decision Context and Consumer Demand for Information Simonson and Tversky (1992) state that consumers tend to focus on the value of each choice alternative in relation to each other rather than absolute values of the alternatives when making a choice. They further posit that because consumers focus on comparing alternatives, changing the choice decision context can increase or decrease the degree of attention (effort) given to making comparisons. As a further consequence, when a consumer’s degree of attention changes, the degree of decision complexity (difficulty) also changes. The relationship between choice decision context and complexity has been the focus of numerous experimental consumer behavior studies. Results show that contextual factors such as increasing the number of choice alternatives (Bhargava, Kim, & Srivastava, 2000; Abdul-Muhmin, 1999; Sivakumar & Cherian, 1995; Simonson & Tversky, 1992), increasing the number of decision attributes (Saad, 1998), changing how information on alternatives is displayed (Coupey, 1994), increasing the similarity of alternatives (Klein & Yadav, 1989; Best & Ursic, 1987; Keller & Staelin, 1987; Malhotra, 1982), and increasing time pressure for making a decision (Dhar et al., 2000; Saad, 1998; Payne et al., 1988) increase consumer choice decision complexity. Increased decision complexity results in changes in decision strategies, an increase in the amount of time taken to make the decision, and inconsistent/inaccurate decision-making. The influence of changing context on consumer choice decision complexity also exists in repetitive choice decision-making environments (Saad, 1998).
6
RONALD J. DAIGLE AND JAMES C. LAMPE
To reduce decision difficulty and improve accuracy, a rational decisionmaker seeks additional information when its marginal value is greater than its marginal cost (e.g., Nelson, 1970; McDonough, 1963; Stigler, 1961). Consumer behavior studies show that the lower a consumer’s certainty of a product’s value, the greater is the desire to acquire information about the product (John et al., 1986; Hagerty & Aaker, 1984; Meyer, 1982). Stated alternatively, prior knowledge can reduce the usefulness of further information because the marginal value of additional information is reduced. With respect to repetitive consumer choice decisions, information obtained for early decisions is carried forward for making subsequent decisions in a sequence regarding the same alternatives, therefore reducing the consumer’s need for further information (e.g., Saad, 1998; Saad & Russo, 1996). Simonson et al. (1988, p. 568) are succinct in describing how further product information may or may not be valuable to consumers by stating: y the acquisition of additional information by consumers when making a decision is expected to be guided by the information the consumer already has about the brands and the degree of certainty felt about that information.
Simonson et al.’s (1988) experimental results support the decreasing net marginal value expectation, as would be expected from the Information Hypothesis. This hypothesis states that information is demanded to reduce risk, improve decision-making, and increase profits. The hypothesis is used to posit that audit (assurance) reports are demanded to help provide these benefits (Wallace, 1980). Synthesizing the results of consumer behavior studies cited and the Information Hypothesis, a change in decision context that increases (decreases) the complexity of repetitive choice decisions is expected to increase (decrease) total demand for information via an assurance service. Besides the importance of determining total demand for CA services, a related aspect is the consistency or persistence of its demand over time. The persistence of demand is important to consider because CA providers are expected to incur large up-front technology costs and efforts to establish such services (Vasarhelyi, 2002; AICPA/CICA, 1999). Furthermore, substantial marginal costs (such as hiring, training, and retaining skilled auditors) will be incurred for generating a periodic CA report. Because CA is to provide information more timely and more often, demand for CA should be sufficient over the extended period of time of its usefulness for its value to exceed its substantial costs. High and persistent demand over time is the optimal scenario for helping successfully establish CA services.
Factors Posited to Increase Demand for Continuous Assurance
7
If demand is low but consistent over time, fewer reports could be generated and thereby reduce the associated costs of making the CA reports available. If demand for CA reports is erratic, those providing CA reports incur higher costs because some of the reports generated will not be acquired for use. Therefore, the importance of studying the persistence of CA demand behavior over an extended timeframe is recognized, but with no theory on which to hypothesize results. CA demand persistence is studied through exploratory analysis and discussed along with results from testing formal hypotheses.
Hypotheses The context of the similarity/dissimilarity of information provided on CA demand patterns is studied for two reasons. First, the essential nature of a choice decision is to discern differences between alternatives and decide which is better/worse, greater than/less than, etc. Second, identifying similarities/dissimilarities is common in a diverse array of decision settings, such as differences in cultural types (Wilkinson, Elahi, & Eidinow, 2003), similarities and differences in consumers visiting a website (Gupta & Mathur, 2002), and similarities and dissimilarities in investment alternatives (Wilcox, 2003). Results from Daigle and Lampe (2004) are based on laboratory experiments where choice decisions are made from comparing operating information of two companies that are very dissimilar. The substantial dissimilarity of information makes the decision relatively simple (less complex). This study extends the prior research by changing the level of similarity of choice decision alternatives. The following result is expected: H1. Increased complexity via similarity of choice alternatives when making repetitive choice decisions increases CA demand. The second major factor studied in this research is the number of CA reports issued during an operating cycle. Consumer research results and the Information Hypothesis both indicate that demand for further information decreases as decision-makers become more certain of the correct decision to be made. Therefore, in a repetitive choice decision-making environment, the frequency of demand for further information should decrease when the number of CA reports made available for decision-making in a timeframe is increased. The cumulative knowledge gained with each additional report should reduce the demand for future CA reports. Stated differently, even if many more CA reports are made available, very few will be purchased.
8
RONALD J. DAIGLE AND JAMES C. LAMPE
Hogarth and Einhorn (1992) provide a model explaining how changing the decision context in a sequential or repetitive decision-making context can influence belief revision. The model has been used often in accounting behavior research (e.g., Trotman & Wright, 1996). While not predicting information demand, the Hogarth and Einhorn model specifies two dimensions associated with the current study – information series numbers and response mode. The number of information announcements in a series may be few (2–12) or many (17 or more). ‘‘Response mode’’ refers to how often decisions are made, ranging from after acquiring each piece of information to after receiving a series of several pieces of information. The model shows that repetitive decisions can be dichotomized as either short or long series of information announcements and information announcements can be acquired individually or in groups before each decision in a sequence. Vasarhelyi (2002) posits that CA may be provided in either a real-time (evergreen) approach or over short intervals following the occurrence of events. The AICPA/CICA defines CA as the serial provision of reports concurrent with or a short time after events of interest occur. CA provided via repetitive available reports soon after event occurrence is more meaningful to study at this time because consistency and persistence of demand over time directly address the long-term viability of accounting professionals providing a new service such as CA. The dichotomy of short (few) vs. long (many) series of information announcements (reports) with a response (decision) after each piece of information (report) is offered in the same total time frame provides a relevant construct for studying CA demand behavior. Prior results from Daigle and Lampe (2004) are based on short series of (few) reports within an operating cycle with a decision after each report. This study extends the prior research by comparing CA demand when both short (few) and long (many) series of reports are made available. Considering this construct with both consumer behavior research cited and the Information Hypothesis, the following results are expected: H2a. When making repetitive choice decisions between similar alternatives, CA demand decreases when more reporting periods exist over a given time frame. H2b. When making repetitive choice decisions between dissimilar alternatives, CA demand decreases when more reporting periods exist over a given time frame.
Factors Posited to Increase Demand for Continuous Assurance
9
METHODOLOGY Four laboratory experiment settings are used to test the two hypotheses. Each of the four settings is explained as follows: DF – repetitive choice decisions, selecting between two DISSIMILAR organizations in a cycle having FEWER (9) CA reporting periods, DM – repetitive choice decisions, selecting between two DISSIMILAR organizations in a cycle having MORE (20) CA reporting periods, SF – repetitive choice decisions, selecting between two SIMILAR organizations in a cycle having FEWER (9) CA reporting periods, and SM – repetitive choice decisions, selecting between two SIMILAR organizations in a cycle having MORE (20) CA reporting periods.
Description of Repetitive Choice Decisions Participants in each setting assume the role of a financial analyst in a multiperiod cycle. Participants are not told the length of the period or the cycle, but only that multiple periods require repetitive decisions in a given operating cycle. A period could be one day, one month, one quarter, or any other timeframe of an operating cycle with a given number of decision periods within the cycle. Participants are informed that the current and further cycles involve the comparison of two new organizations unrelated to those in previous cycles. The organizations compared in each cycle are identical except for production process efficiency. One to four control errors can occur in each organization’s production process in any given period, with errors reducing production process efficiency. The organization with the fewest errors (lowest cost) for the entire cycle is more efficient and profitable. The financial analyst’s task is to determine which company is the more profitable one to recommend for investment. The criterion of control errors is used because control data (whether operational or financial) has been identified as a likely subject matter for CA (Vasarhelyi, 2002). While other criteria could be used, control errors provide a realistic CA reporting scenario. The number of errors that can occur in a period is controlled by a fixed probability of occurrence. Two probability distributions (one for each organization per cycle) are described and presented to participants. The primary task of the surrogate financial analysts is to determine the probability distribution of errors associated with each of the two alternative production processes. Table 1 shows the sets of
10
RONALD J. DAIGLE AND JAMES C. LAMPE
Table 1. Organization
Production Control Error Probabilities per Period. Probability of 1 Control Error
Probability of 2 Control Errors
Probability of 3 Control Errors
Probability of 4 Control Errors
Panel A – DF and DM settings #1 10% #2 5%
80% 5%
5% 80%
5% 10%
Panel B – SF and SM settings #1 30% #2 20%
25% 25%
25% 25%
20% 30%
DF: Dissimilar organizations with fewer reporting periods; DM: Dissimilar organizations with more reporting periods; SF: Similar organizations with fewer reporting periods; SM: Similar organizations with more reporting periods.
probability scenarios for the different settings. In both the Panel A and Panel B production process scenarios, Scenario #1 generates fewer errors over the entire cycle but either of the two scenarios may have a greater number of errors in any one period. A second aspect of Table 1 Panels A and B is the similarity/dissimilarity of the probability distributions. Panel A illustrates probability distributions that are very dissimilar and result in greater differences of total errors occurring through the cycle. Panel B contains very similar probability distributions that result in smaller differences in the total number of errors occurring through the entire cycle. CA reports for the companies with Panel A distributions contain dissimilar information about the two companies, while the CA reports for the two companies with Panel B distributions contain similar information. Because the probability distributions are dissimilar in both DF and DM, but similar in both SF and SM, results provide the ability to test H1. Hogarth and Einhorn (1992) conclude that differences exist in decision behavior between series with 12 or fewer decisions vs. 17 or more decisions. The difference in the number of periods between both DF and SF (12 3=9) and DM and SM (17+3=20) has been designed to clearly represent a fewer set of reporting periods and a greater set. This provides the ability to test H2a (SF vs. SM) and H2b (DF vs. DM). Participants know that the two organizations they are comparing (referred to by letters, such as A and B) have different control error probabilities. They do not know which production process scenario is assigned to which organization in a cycle, but make a period-by-period
Factors Posited to Increase Demand for Continuous Assurance
11
decision of whether to purchase CA reports that indicate the number of errors encountered by each organization each period through the period of purchase. In each period participants also choose the organization that they believe will be more efficient over the entire cycle and is, therefore, the better investment. The first scenario in each panel of Table 1 is more efficient because the smaller mean probability of error occurrence per period results in fewer errors over the entire cycle, even though the less efficient organization may appear to be more efficient in any one time period. The organization with the first process is therefore the correct choice each period of a cycle. The task of making repetitive choice decisions about the better of the two companies represents the most typical repetitive decisions facing the traditional financial statement audit users targeted as the likely primary users of CA services – analysts and investors.
Assurance Reports and Experimental Tension Participants can (1) earn lab dollars of reward each period when making the correct choice; (2) incur a penalty of lab dollars each period when making the wrong choice; or (3) abstain, with no rewards earned or penalties incurred for that period. After the 9th/20th (few/many report settings) reporting period in a cycle, one last period occurs. Participants receive a free report at the end of the final period, detailing by organization and by period the precise number of errors that occurred during the cycle. The free report may be thought of, for example, as similar to publicly available interim review reports on the condensed financial statement information of alternative companies at the end of a quarter. Because the report is free and provides perfect information, no rewards are earned or penalties incurred in the final 10th/21st period of the cycle. These are the only free reports provided in the cycle. Participants may purchase assurance in some, all, or none of the periods before the final period in a cycle at a price based on the ‘‘mean price of indifference’’ within the given setting.1 Assurance reports disclose the precise number of errors by period and organization through the period of purchase. Precise information is provided because experimental findings indicate that more precise reports are demanded more often than less precise reports when both provide equal net economic benefit (Arnold, Lampe, Masselli, & Sutton, 2000). Because CA methods potentially allow entire data populations to be tested (Alles et al., 2002), precise information in CA reports provides a realistic experimental scenario.
12
RONALD J. DAIGLE AND JAMES C. LAMPE
Participants receive a lab dollar budget each cycle that provides the ability to purchase assurance in any and every decision period at the mean price of indifference, if desired. Note that participants pay directly for assurance, as opposed to the traditional financial statement audit that is paid for by the auditee. Direct payment for CA services by users, either on a ‘‘pay as you go’’ basis or a subscription upfront, has been proposed (Alles et al., 2002). Whether paid for by users on a ‘‘pay as you go’’ basis or by the organization providing the continuous information, ultimate demand for assurance reports is determined by the end user. A ‘‘pay as you go’’ method is used because it allows a direct testing of each participant’s demand for and use of CA. Participants are better compensated if correct decisions are made without purchasing reports. Each participant earns a $5 base for participating in one of the four experiments and can earn up to $15 more based on total rewards earned less total penalties incurred plus unspent budget in proportion to the most lab dollars accumulated by a participant in their particular experimental session. The participant who accumulates the most lab dollars in a session (from making the most correct and fewest incorrect choice decisions with the fewest report purchases) earns an additional $15 merit pay (total $20 compensation to the participant). All remaining participants in the same session earn a proportionate amount of the merit pay based on their lab dollar accumulations in relation to the participant with the greatest accumulation. Efficient purchasing of CA reports while making correct choices helps maximize lab dollars accumulated. Both the pricing of reports at the mean price of indifference and the competitive monetary payoff method provide experimental ‘‘tension’’ (i.e., motivation) for participants to purchase assurance only when it is perceived to help improve both choice decision accuracy and profits. With matched settings differing respectively by the dissimilarity/similarity of choices and the number of reporting periods, a setting’s CA purchasing frequency rate indicates the relative demand for CA and are compared to test both hypotheses.
DATA COLLECTION AND ANALYSIS Data have been collected from graduate business student participants in four separate sessions at the same university – one session for each laboratory experiment setting. A pilot-test of DF using graduate business students at the same university has confirmed the propriety of using this subject pool. Each session is conducted during normal class times provided
Factors Posited to Increase Demand for Continuous Assurance
13
by course instructors with voluntary student participation. The researchers are not the instructors of record in any of the four courses, and neither CA nor choice decision-making is a curriculum topic in any of the courses.2 Students receive written instructions one class period prior to the experimental session for the setting in which they will participate. Random assignment of error probability scenarios and generation of error data are completed prior to each session. Each session follows a similar set of activities, including a three-period learning session before the actual experiment. Most participants appear to have benefited from the advance materials and learning sessions and have demonstrated understanding of their setting and decisions required.3 Table 2 summarizes key descriptive statistics for each setting, with emphasis on the mean frequency rates of purchasing assurance. This rate is computed by dividing the number of assurance purchases by the number of times assurance could have been purchased, averaged across all participants in each respective setting. Table 3 summarizes results of logistical regression analyses of the rates for testing both H1 (similarity/dissimilarity of alternatives) and H2 (availability of few/many assurance reports). Results are also presented graphically in Figs. 1–4. Participants in each setting make the following decisions in each new reporting period: (1) whether to purchase CA reports and (2) which organization is more efficient and, therefore, the better recommended investment over the cycle. Participants in DF and SF make 9 sets of decisions in each of two cycles (18 decisions total) vs. 20 sets of decisions made in one cycle by participants in the DM and SM settings.4 In statistical terminology this is a repeated measures exercise – each participant makes identical choice decisions based on similar information in repeated observations. A potential problem in repeated measures studies is that participants may not provide reliable decisions in each of the repeated decision situations. Some type of repeated measure association between observations may occur because of inattention due to boredom or familiarity with the repetitive information made available. To test for potential repeated measures effects, alternating logistic regression analysis (ALR) is used to analyze the purchase decision data.5 In most basic terms, the ALR analysis procedure (in the GENMOD procedures of SAS) tests for association between the repeated observations for each subject and estimates a correction for observations deemed to be outside constraints based on means of prior decisions (Carey, Zeger, & Diggle, 1993). Table 2 also shows each setting’s mean rate of purchase persistence (MRPP). This rate is computed by comparing each participant’s purchase
14
RONALD J. DAIGLE AND JAMES C. LAMPE
Table 2. Descriptive Participant Statistics. Descriptive Statistic
DF – Dissimilar SF – Similar DM – Dissimilar SM – Similar Choices with 9 Choices with 9 Choices with 20 Choices with 20 Reports per Reports per Reports per Reports per Cycle Cycle Cycle Cycle
Sample size 31 30 27 28 Mean years of work 1.77 2.60 1.18 1.04 experience Mean frequency rate 22.76% 25.19% 9.44% 18.93% of purchasing assurance Standard deviation of 0.1186 0.1278 0.0610 0.0762 frequency rate of purchasing assurance Mean rate of purchase 3.63% 5.63% 1.36% 2.26% persistence Standard deviation of 0.0718 0.0826 0.0375 0.0463 mean rate of purchase persistence Mean decision 80.83%/72.13% 67.69%/53.53% 92.16%/93.53% 74.53%/84.14% accuracy rate after purchasing assurance/no assurance $15.42 $14.00 $18.70 $16.07 Mean monetary payoff DF: Dissimilar organizations with fewer reporting periods; DM: Dissimilar organizations with more reporting periods; SF: Similar organizations with fewer reporting periods; SM: Similar organizations with more reporting periods.
decision with the decision made in the preceding period (8 comparisons are made per cycle in both DF and SF while 19 are made in both DM and SM), averaged across all setting participants. While not used to test either hypothesis, MRPPs provide insights into the persistency of demand over time in a setting. Table 2 also shows decision accuracy rates of participants in periods after CA is purchased vs. when not purchased. Prior consumer behavior studies cited report that changing a choice decision context can change decision accuracy because of a change in decision complexity. Decision accuracy has also been shown to be a direct measure
Factors Posited to Increase Demand for Continuous Assurance
Table 3.
15
Alternating Logistical Regression Results. Z-Statistic
Hypothesis
One-Tailed p-Value
H1: Comparison of mean frequency rates of purchasing assurance between DF and SF DM and SM
0.65 4.95
0.2966 o0.0001
H2: Comparison of mean frequency rates of purchasing assurance between DF and DM SF and SM
6.46 2.95
o0.0001 0.0082
DF: Dissimilar organizations with fewer reporting periods; DM: Dissimilar organizations with more reporting periods; SF: Similar organizations with fewer reporting periods; SM: Similar organizations with more reporting periods.
50% DF SF
45%
% Purchase Frequency
40% 35% 30% 25% 20% 15% 10% 5% 0% 1
2
3
4
5 Period
6
7
8
9
DF: Dissimilar organizations with fewer reporting periods SF: Similar organizations with fewer reporting periods
Fig. 1.
Period-by-Period Frequency Rates of Purchasing Assurance in DF and SF.
of decision complexity when making accounting and audit decisions (e.g., Chang, Ho, & Liao, 1997; Bonner, 1994). Decision accuracy rates, therefore, provide evidence of the relative decision complexity of experimental settings.
16
RONALD J. DAIGLE AND JAMES C. LAMPE
50% DM SM
45%
% Purchase Frequency
40% 35% 30% 25% 20% 15% 10% 5% 0% 1
2
3
4
5
6
7
8
9 10 11 12 13 14 15 16 17 18 19 20 Period
DM: Dissimilar organizations with more reporting periods SM: Similar organizations with more reporting periods
Fig. 2.
Period-by-Period Frequency Rates of Purchasing Assurance in DM and SM.
Results of Hypothesis Testing H1 is tested twice, with comparisons of the mean frequency rates of purchasing assurance between both DF vs. SF and DM vs. SM. When fewer CA reports are made available per cycle, no significant difference in purchase frequency is noted between DF and SF (22.76% vs. 25.19%, p-value of 0.2966 in Table 3). When more CA reports are made available per cycle, the purchase frequency rate in SM (18.93%) is significantly greater than that in DM (9.44%) with a p-valueo0.0001. H1 is therefore accepted only in the many reports (SM/DM) setting. The mixed results from testing H1 emphasize the likely influence of the number of reporting periods on CA demand, which is tested in H2. For dissimilar choice alternatives, logistic regression results indicate that the mean frequency rate of purchasing assurance declines significantly from DF to DM (22.76% vs. 9.44% p-valueo0.0001). For similar choice alternatives, the purchase frequency rate also declines significantly from SF to SM (25.19% vs. 18.93%, p-value of 0.0082). Both H2a and H2b are accepted,
Factors Posited to Increase Demand for Continuous Assurance
17
50% DF DM
45% % Purchase Frequency
40% 35% 30% 25% 20% 15% 10% 5% 0% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Period DF: Dissimilar organizations with fewer reporting periods DM: Dissimilar organizations with more reporting periods
Fig. 3.
Period-by-Period Frequency Rates of Purchasing Assurance in DF and DM.
with the conclusion that the CA purchase frequency rate is greater, regardless of the similarity/dissimilarity of choice alternatives, when fewer reports are provided.
Graphical Representations of Purchase Behavior Besides computing differences in purchase frequency rates, Figs. 1–4 provide visual evidence of differences in CA demand behavior between matched experiments. While mean frequency rates do not differ significantly between DF and SF, Fig. 1 indicates a clearly different purchase pattern for CA reports with similar vs. dissimilar choices. Decision-makers in DF make most of their purchases in the first half of the cycle, with purchases declining steadily from the 3rd through 9th periods. This decline is likely due to participants discerning early on in the cycle as to which organization is the better choice, therefore reducing their frequency of purchasing assurance in later periods. In contrast, demand for CA in the SF experiment (see Fig. 1) continues into the second half of
18
RONALD J. DAIGLE AND JAMES C. LAMPE
0.45 SF SM
% Purchase Frequency
0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Period SF: Similar organizations with fewer reporting periods SM: Similar organizations with more reporting periods
Fig. 4.
Period-by-Period Frequency Rates of Purchasing Assurance in SF and SM.
the cycle with an overall increase in purchasing from the 3rd through 7th periods. The difference between the two experiments in the timing of when assurance is purchased in the cycle is likely due to the more complex choice decision of choosing between more similar alternatives in SF. Consumer behavior research and accounting and audit research (e.g., Chang et al., 1997; Bonner, 1994) predict lower decision accuracy as decisions become more complex. In this study, higher decision accuracy rates occur in DF relative to SF after both purchasing and not purchasing assurance (80.83/72.13% vs. 67.69/53.53% in Table 2). These results indicate that participants in DF are making a less difficult (less complex) repetitive decision than participants in SF. The difference in decision complexity is also reflected in mean payoffs per participant, which are also shown in Table 2. The reader is reminded that participants within a particular session are compensated for their decisions in relation to the participant that accumulates the most lab dollars. The higher decision accuracy rate among DF participants explains why they earned a higher mean payoff than SF participants ($15.42 vs. $14.00).
Factors Posited to Increase Demand for Continuous Assurance
19
As a further observation, there is little difference between the MRPPs of DF and SF (3.63% vs. 5.63% in Table 2). It may be noted in Fig. 1 that the persistence of CA demand is approximately the same across the cycle in both DF and SF, only differing in which part of the cycle (early for DF and later for SF) CA demand mainly occurs. Fig. 2 visually illustrates that, in the many reports setting, assurance purchasing in DM declines rapidly early in the cycle and stays low thereafter. In contrast, assurance purchases in SM continue on throughout more of the cycle, although with a declining seesaw pattern of peaks and valleys of purchasing. Higher decision accuracy rates in DM than in SM (92.16/93.53% vs. 74.53/84.14% in Table 2), coupled with a higher mean payoff ($18.70 vs. $16.07 in Table 2) again indicate that choosing between similar alternatives is a more complex decision. With consideration of the greater number of periods in these two settings than in DF and SF, decision accuracy rates explain the greater CA demand in SM than in DM. While demand frequency may be greater in SM, MRPPs are similar and very low in both DM (1.36%) and SM (2.26%). This indicates that CA demand is inconsistent when the number of reports provided for repetitive choice decision-making is increased, regardless of similarity/dissimilarity of alternatives. Figs. 3 and 4 compare purchasing behavior in the few reports vs. many reports settings for both dissimilar and similar choice alternatives, respectively. Both figures show that increasing the number of reports available within a cycle leads to a decline in CA purchase frequency in later periods. In Fig. 3, little demand for assurance exists after approximately four or five reporting periods in both DF and DM because, as noted by decision accuracy rates, participants appear to have become certain early in the cycle as to which is the correct choice between dissimilar alternatives without need for further information. While the MRPP in DF (3.63%) is marginally greater than that in DM (1.36%), each setting’s low MRPP confirms the lack of demand persistence when making repetitive choice decisions with dissimilar alternatives. Fig. 3 visually shows that the difference in the number of periods between DM and DF influences CA purchase frequency to be lower in DM (9.44%) than in DF (22.76%). This is shown by the low to non-existent demand after the 5th period. Although demand in both DM and DF is similarly low after the 6th period, the purchase frequency rate of DM is much lower because more reports are made available but few are purchased. Note that DM participants are somewhat more accurate after not purchasing assurance (93.53%) than after purchasing assurance (92.16%). These results are consistent with previously cited consumer behavior
20
RONALD J. DAIGLE AND JAMES C. LAMPE
research. The reader is reminded that as a consumer choice decision increases in repetition, the need for further information declines because prior information is useful for making further decisions in the sequence. A review of the detail of choice decisions by period shows that DM and DF participants successfully identify the better organization by the time most CA purchase activity ceases (around the fifth period). DM participants have higher decision accuracy rates, as well as higher payoff rates after not purchasing assurance than after purchasing assurance. The higher accuracy and payoff rates occur because participants become certain of the correct choice early in the cycle and have more subsequent periods when they do not purchase assurance, but still make repetitively accurate choice decisions. Fig. 4 shows that increasing the number of reports during the cycle leads to a discernable pattern of demand in a similar choice decision setting. A cyclic pattern of a purchasing peak followed by a valley of few purchases is repeated through the first half of the cycle when many reports are available. An overall pattern of decreased purchasing in later periods of the many reports setting occurs because decision-makers have increased knowledge accumulation from previous purchases. Similar to the comparison of DF to DM, both decision accuracy rates and mean payoffs in the SF SM settings indicate that the decision in SF is more difficult, as hypothesized (67.69/53.53% and $14.00 for SF vs. 74.53/84.14% and $16.07 for SM). Like DM participants and consistent with consumer behavior research, SM participants are also more accurate after not purchasing assurance (84.14%) than after purchasing assurance (74.53%). This result occurs because of the greater number of periods of repetitious decision-making in the SM setting, thereby allowing participants to accumulate knowledge from earlier CA reports and make repetitive accurate choices in later periods. As a final comment regarding Fig. 4, a comparison of MRPPs for SF (5.63%) and SM (2.26%) shows that having fewer reporting periods increases the consistency of CA demand, a result consistent with the comparison of DF and DM. However, the rate of MRPP (5.63%) is still very low and on a practical basis indicates inconsistent demand for CA. Demand from one period to the next is inconsistent in all of the four experimental settings.
IMPACT OF STUDY The purpose of this research is to determine if either increases in either the similarity of information (more complex decisions) about alternative
Factors Posited to Increase Demand for Continuous Assurance
21
investment options or the number of CA reports made available in a given operating cycle would influence demand for CA reports by repetitive choice decision-makers. These two factors are studied in a repetitive choice decision-making context in which the choices are investment alternatives because such decisions are commonly made by certain traditional financial statement audit users (financial analysts/investors) that have been identified by the AICPA/CICA as the primary targeted consumers of external CA services. Three key results are found. The results indicate that CA demand increases when decision-makers choose between similar alternatives vs. dissimilar alternatives, but only when the number of reporting periods is increased. This result occurs because the increased complexity in discerning differences between similar choice alternatives leads decision-makers to demand information via CA for improving choice decision accuracy. The increase, though statistically significant, is still a relatively small increase on a low base. The reader is reminded that the purchase frequency rates for similar vs. dissimilar information reports (SM vs. DM) are 18.9% vs. 9.4%. While the difference is statistically significant, a more practical question is if free market purchases of less than 20% of available reports (less than 10% when the information presented is dissimilar) is adequate to justify CA report preparation. Our second result indicates that the frequency rate of CA purchasing decreases when the number of reporting periods increases. This occurs because repetitive decision-makers accumulate information from previous CA reports that allow near certainty of making the correct choice between alternatives without needing additional reports. The more dissimilar the information about alternatives, the sooner the decision-maker becomes confident of the better company. Increasing the number of CA reports made available to decision-makers does not substantially increase the total number of reports demanded by choice decision-makers. Instead, it significantly reduces the overall frequency rate of CA demand. Our third key research result is that CA demand is erratic in all experimental settings. In settings where dissimilar choice decisions are made, CA demand peaks early in the repetitive decision-making process, and then becomes almost nonexistent. In settings where choice decisions are made, based on similar information about the alternatives, CA demand is relatively greater in later reporting periods but is still low and not sufficiently consistent to produce a stable CA market exists. Results, therefore, provide mixed findings, although the demand for CA is greater both as choice alternatives become more similar and when the number of reporting periods decreases,
22
RONALD J. DAIGLE AND JAMES C. LAMPE
total demand is low on a practical basis and not sufficiently consistent for incurring substantial marginal costs to provide CA reports every decisionmaking period. Neither increasing the similarity of information, nor increasing the number of reports made available in an operating cycle provides either high or persistent demand for CA when making repetitive choice decisions. Our results are potentially beneficial for both practice development and guidance for future research. The profession should seriously question the development of CA primarily as an external service for choice decisionmakers. The upfront costs for implementing CA services are substantial because of the technology involved (Vasarhelyi, 2002; AICPA/CICA, 1999). High fixed up-front costs, continuing marginal costs for report generation, and low and inconsistent demand from external choice decision-makers combine to indicate that extreme care should be taken in a decision to provide CA services for external investment decision-makers. Internal decision-makers have been noted as a potential group that will more likely demand and benefit from CA (Vasarhelyi, 2002), especially as the SarbanesOxley Act of 2002 continues, on an increasing basis, to make key executives within publicly traded companies to be personally liable for the reliability of both control and financial information presented in public filings. With decision consequence increasing CA demand (Daigle & Lampe, 2004), internal management of publicly traded companies is a likely primary target for internal CA services. More research is needed for understanding the demand for CA services. Prior consumer behavior research has identified a number of factors involved in choice decisions that increase decision-making complexity, such as the number of alternatives and attributes of importance. These factors, therefore, are also likely to influence the demand for CA information. Studying such factors individually and collectively in a repetitive decisionmaking environment may provide insights as to whether a market for CA services is (or is not) formed and sustained over an extended timeframe.
NOTES 1. The ‘‘mean price of indifference’’ is based on the difference in the economic value between the most risk-averse behavior (desiring assurance every period) and the most likely rational risk-taking behavior (making the best guess based on prior information known and without use of assurance) based on the facts in a given setting. The error range, pair of process scenarios, number of periods in which
Factors Posited to Increase Demand for Continuous Assurance
23
assurance can be purchased, and reward/penalty structure are used to calculate the value of a CA report under the most risk-averse behavior in each setting. An assurance report becomes more reliable with each period because more cumulative information is revealed each period. The reward/penalty structure (300 lab dollar reward/200 lab dollar penalty per period) is considered with the cumulative probabilities of report accuracy for computing the total number of lab dollars that would be accumulated through a cycle in a given setting. After purchasing CA, the logical choice decision as to which organization is the better investment over the cycle is the organization that has fewer cumulative control errors in the report. The total number of lab dollars accumulated less the cost of purchasing CA each period is the value of assurance every period. The value in lab dollars from following the most likely rational risk-taking behavior is then calculated based on the best guessing strategy in a given setting. The best guessing strategy in all settings is to choose one organization half the time as being better and choose the other as being better half the time. The expected number of lab dollars accumulated from the best guess strategy is then subtracted from the expected number of lab dollars accumulated from using assurance every period. Dividing this difference by the number of assurance purchasing periods provides the mean price of indifference for assurance per period. The lab dollar budget provided each participant is the expected dollars needed for correcting the control errors plus the mean price of indifference multiplied by the periods in which reports are available (9/20 for the few/many settings). Setting
DF SF DM SM
Mean Price of Indifference per Period
Cycle Budget
225 50 240 75
2,025 450 4,800 1,500
Participants are given the budget at the beginning of each cycle and have total discretion over its use. With these budgets, pursuing either the most risk-averse strategy or most risk-taking behavior mathematically results in the same net economic outcome over a cycle in a given setting. Both of these strategies can be improved upon by participants who determine and act upon the given process scenarios. 2. Instructors are thanked for allowing access to student participants during scarce class time. 3. The SM experiment has also been conducted using 30 graduate business students at another university, with all data collected in a single class session. The researchers are not the instructor, nor are CA or choice decision-making covered, in the course. Behavior in the alternative experiment is completely consistent with the behavior in the SM experiment reported in later sections of this study, including period-by-period assurance purchase behavior shown in Figs. 2 and 4. No significant
24
RONALD J. DAIGLE AND JAMES C. LAMPE
difference is found between the mean frequency rate of purchasing assurance in the SM experiment reported in this study (18.93%, as shown in Table 2) and the alternative experiment (17.00%). Interpretations of, as well as conclusions reached from, both assurance purchase behavior and decision accuracy data in the alternative experiment are the same as those based on the SM data included and discussed in later sections of this study. One noted difference between the two SM experiments is mean years of professional work experience. Participants in the alternative experiment have 4.73 years experience vs. 1.04 years experience for participants in the SM experiment reported in this study. No significant differences in experience are found between the four experiments reported in this study (lowest two-tailed p-value is 0.1254 (SF vs. SM), with all others >0.28). Experience differences exist, however, between the alternative SM experiment and all experiments reported in this study (all two-tailed p-valueso0.082). The alternative SM experiment, therefore, fails to show that factors such as university, instructor, work experience, and collecting all data for a setting using students in a single classroom session influences this study’s results and conclusions. Instead, the additional data collected indicates strong validity for how the experiments have been designed and conducted. Participating graduate students are, therefore, the best available source of knowledgeable and rational decision-makers on purchasing CA. 4. DF and SF experiments were conducted first. Analysis of behavior in DF and SF indicates a second cycle was unnecessary because behavior in each experiment is consistent across the two cycles. Therefore, only one cycle was conducted in the DM and SM experiments, performed subsequent to DF and SF. 5. The primary approach to the General Estimating Equation (GEE) for the Alternating Logistical Regression Analysis (ALR) is to model the association between pairs of responses with log odds ratios. Per the GENMOD Procedure description in the online SAS manual, ‘‘correlation is constrained to be within limits that depend, in a complicated way, on the means of the binary data.’’ Modeled as a vector of regression parameters, the output defines subgroups within clusters or ‘‘block effects’’ between clusters in a multistage analysis. Clusters are fully parameterized for each unique pair. When analyzing the DF and SF settings data, a total of 153 parameters are defined in the total vector. For the DM and SM settings, 20 decisions in one cycle results in 190 parameters in the total vector. The basic vector is 18 2 for the nine decisions in each of two cycles. The complete vector contains [(n(n 1)/2] clusters in the form of a Z matrix. Because the data are binary (yes/no for purchase of CA) only one pair of observations are made available. In the first phase of processing, all of the pairs are analyzed as a Z matrix [(1,2), (1,3) y (1.18), (2.3), (2,4) y (2,18), y (17,18)]. The model constrains the observations by log odds ratios constraints within each cluster. The first phase of processing estimates the association between clusters (model based covariance estimate) by setting up a contrast matrix based on the log odds ratio constraints that are tested for agreement with the original matrix. The p-value for the association is computed based on the chi-square distribution. Output from the estimate is labeled ‘‘Alpha 1’’ and is printed with the original (first run) standard error estimate. The regression is run a second time with the constrained values resulting in a standard
Factors Posited to Increase Demand for Continuous Assurance
25
error estimate somewhat less probable to be significant. Appropriate output for all tests is: Alpha 1 Probability DF vs. SF Initial estimate GEE DM vs. SM Initial estimate GEE DF vs. DM Initial estimate GEE SF vs. SM Initial estimate GEE a
Z-Statistic
0.089
Probability
0.5383 0.5933a
0.645
o0.0001 o0.0001
0.435
o0.0001 o0.0001
0.091
0.0110 0.0164a
Two-tail probabilities are divided by two for presentation as one-tail probabilities in Table 3.
The overall conclusion is that, based on ALR analysis, very little association (covariance) is present in the DF, DM, and SM settings. Some association between responses is found to exist in the SF setting, but is not significant and correction of the association via the alternating logistical regression estimate has little affect on the probabilities of significant differences between mean purchase rates. Simple t-test results are not substantially different because no significant associations are found between clusters and corrected for in subsequent runs.
ACKNOWLEDGMENTS The authors thank Vicky Arnold and two anonymous reviewers for helpful suggestions for improving the study. The authors also thank James Carver and Mike Wittmann for helpful suggestions with the literature review and theoretical development. The authors also thank Ron Bremer for helpful assistance with statistical analysis. The authors also thank participants at the 2004 Annual Continuous Auditing and Reporting Symposium and 2005 IS Mid-Year Meeting for helpful suggestions on early drafts of the study.
REFERENCES Abdul-Muhmin, A. G. (1999). Contingent decision behavior: Effect of number of alternatives to be selected on consumers’ decision processes. Journal of Consumer Psychology, 8(1), 91–111.
26
RONALD J. DAIGLE AND JAMES C. LAMPE
Alles, M. G., Kogan, A., & Vasarhelyi, M. A. (2002). Feasibility and economics of continuous assurance. Auditing: A Journal of Practice and Theory, 21(1), 125–138. Alles, M. G., Kogan, A., & Vasarhelyi, M. A. (2003). Black box logging and tertiary monitoring of continuous assurance systems. Information Systems Control Journal, 1, 37–39. Alles, M. G., Kogan, A., & Vasarhelyi, M. A. (2004). Restoring auditor credibility: Tertiary monitoring and logging of continuous assurance systems. International Journal of Accounting Information Systems, 5(2), 183–202. American Institute of Certified Public Accountants (AICPA) & Canadian Institute of Chartered Accountants (CICA) Study Group. (1999). Continuous auditing. New York, NY: AICPA and CICA. Arnold, V., Lampe, J. C., Masselli, J., & Sutton, S. G. (2000). An analysis of the market for systems reliability assurance services. Journal of Information Systems, 14(Supplement), 65–82. Arnold, V., & Sutton, S. G. (2001). The future of behavioral accounting (information systems) research. Advances in Accounting Behavioral Research, 4, 141–153. Behn, B. K., Searcy, D. L., & Woodroof, J. B. (2006). A within firm analysis of current and expected future audit lag determinants. Journal of Information Systems, 20(1), 65–86. Best, R. J., & Ursic, M. (1987). The impact of information load and variability on choice accuracy. Advances in Consumer Research, 14(1), 106–108. Bhargava, M., Kim, J., & Srivastava, R. K. (2000). Explaining context effects on choice using a model of comparative judgment. Journal of Consumer Psychology, 9(3), 167–177. Billings, R. S., & Scherer, L. L. (1988). The effects of response mode and importance on decision-making strategies: Judgment versus choice. Organizational Behavior and Human Decision Processes, 41(1), 1–19. Bonner, S. E. (1994). A model for the effects of audit task complexity. Accounting, Organizations and Society, 19(3), 213–234. Carey, V., Zeger, S. L., & Diggle, P. (1993). Modeling multivariate binary data with alternating logistic regression. Biometrica, 80(3), 517–526. Chang, C. J., Ho, J. L. Y., & Liao, W. M. (1997). The effects of justification, task complexity and experience/training on problem-solving performance. Behavioral Research in Accounting, 9(Supplement), 98–116. Coupey, E. (1994). Restructuring: Constructive processing of information displays in consumer choice. Journal of Consumer Research, 21(1), 83–99. Daigle, R. J., & Lampe, J. C. (2004). The impact of the risk of consequence on the relative demand for continuous online assurance. International Journal of Accounting Information Systems, 5(3), 313–340. Dhar, R., Nowlis, S. M., & Sherman, S. J. (2000). Trying hard or hardly trying: An analysis of context effects in choice. Journal of Consumer Psychology, 9(4), 189–200. Ganzach, Y. (1995). Attribute scatter and decision outcome: Judgment versus choice. Organizational Behavior and Human Decision Processes, 62(1), 113–122. Groomer, S. M., & Murthy, U. S. (1989). Continuous auditing of database applications: An embedded audit module approach. Journal of Information Systems, 3(1), 53–69. Gupta, A., & Mathur, A. (2002). Adaptive delivery of e-commerce web sites. Intelligent Data Analysis, 6, 469–480. Hagerty, M. R., & Aaker, D. A. (1984). A normative model of consumer information process. Marketing Science, 3(3), 227–246.
Factors Posited to Increase Demand for Continuous Assurance
27
Halper, F. B., Snively, J., & Vasarhelyi, M. A. (1992). The continuous process audit system: Knowledge engineering and representation. EDPACS, 20(4), 15–22. Hogarth, R. M. (1981). Beyond discrete biases: Functional and dysfunctional aspects of judgmental heuristics. Psychological Bulletin, 90(2), 197–217. Hogarth, R., & Einhorn, H. (1992). Order effects in belief updating: The belief-adjustment model. Cognitive Psychology, 24(1), 1–55. Hunton, J. E., Wright, A. M., & Wright, S. (2004). Continuous reporting and continuous assurance: Opportunities for behavioral accounting research. Journal of Emerging Technologies in Accounting, 1, 91–102. John, D. R., Scott, C. A., & Bettman, J. R. (1986). Sampling data for covariation assessment: The effect of prior beliefs on search patters. Journal of Consumer Research, 13(1), 38–47. Kearns, J. (1980). Are we ready for continuous process auditing? CA Magazine, 13(September), 68–71. Keller, K. L., & Staelin, R. (1987). Effects of quality and quantity of information on decision effectiveness. Journal of Consumer Research, 14(2), 200–213. Klein, N. M., & Yadav, M. S. (1989). Context effects on effort and accuracy in choice: An enquiry into adaptive decision making. Journal of Consumer Research, 16(4), 411–421. Koch, S. (1981). Online computer auditing through continuous and intermittent simulation; Harvey. MIS Quarterly, 5(1), 29–41. Kogan, A., Sudit, E. F., & Vasarhelyi, M. A. (1999). Continuous online auditing: A program of research. Journal of Information Systems, 13(2), 87–103. Malhotra, N. (1982). Information load and consumer decision making. Journal of Consumer Research, 8(4), 419–431. McDonough, A. M. (1963). Information economics and management systems. New York, NY: McGraw-Hill Book Company, Inc. Meyer, R. J. (1982). A descriptive model of consumer information search behavior. Marketing Science, 1(1), 93–121. Murthy, U. S., & Groomer, S. M. (2004). A continuous auditing web services model for XMLbased accounting systems. International Journal of Accounting Information Systems, 5(2), 139–164. Nelson, P. (1970). Information and consumer behavior. Journal of Political Economy, 78(2), 311–329. Noteberg, A., Benford, T. L., & Hunton, J. E. (2003). Matching electronic communication media and audit tasks. International Journal of Accounting Information Systems, 4(1), 27–56. Payne, J. W., Bettman, J. R., & Johnson, E. J. (1988). Adaptive strategy selection in decision making. Journal of Experimental Psychology: Learning, Memory and Cognition, 14(3), 534–552. Rothstein, H. G. (1986). The effects of time pressure on judgment in multiple cue probability learning. Organizational Behavior and Human Decision Performance, 37, 83–92. Saad, G. (1998). Information reacquisition in sequential consumer choice. Advances in Consumer Research, 25(1), 233–239. Saad, G., & Russo, J. E. (1996). Stopping criteria in sequential choice. Organizational Behavior and Human Decision Processes, 67(3), 258–270. Searcy, D., Woodroof, J., & Behn, B. (2003). Continuous audit: The motivation, benefits, problems, and challenges identified by partners of a big 4 accounting firm. 36th Annual Hawaii International Conference on System Sciences, Big Island, Hawaii.
28
RONALD J. DAIGLE AND JAMES C. LAMPE
Selart, M. (1996). Structure compatibility and restructuring in judgment and choice. Organizational Behavior and Human Decision Processes, 65(2), 106–116. Simonson, I., Huber, J., & Payne, J. (1988). The relationship between knowledge and information acquisition order. Journal of Consumer Research, 14(4), 566–578. Simonson, I., & Tversky, A. (1992). Choice in context: Tradeoff contrast and extremeness aversion. Journal of Marketing Research, 29(3), 281–295. Sivakumar, K., & Cherian, J. (1995). Role of product entry and exit on the attraction effects. Marketing Letters, 29(1), 45–51. Smith, C. A. P., Arnold, V., & Sutton, S. G. (1997). The impact of time pressure on decisionmaking for choice and judgment tasks. Accounting and Business Review, 4(2), 365–383. Stigler, G. J. (1961). The economics of information. Journal of Political Economy, 69(3), 213–225. Trotman, K. T., & Wright, A. (1996). Recency effects: Task complexity, decision mode, and task-specific experience. Behavioral Research in Accounting, 8, 175–193. Vasarhelyi, M. A. (1998). Toward an intelligent audit: Online reporting, online audit, and other assurance services. Advances in Accounting Information Systems, 6, 207–221. Vasarhelyi, M. A. (2002). Concepts in continuous assurance. In: V. Arnold & S. G. Sutton (Eds), Researching accounting as an information systems discipline (pp. 257–271). Sarasota, FL: American Accounting Association. Vasarhelyi, M. A., Alles, M. G., & Kogan, A. (2004). Principles of analytic monitoring for continuous assurance. Journal of Emerging Technologies in Accounting, 1, 1–21. Vasarhelyi, M. A., & Halper, F. B. (1991). The continuous audit of online systems. Auditing: A Journal of Practice and Theory, 10(1), 110–125. Vasarhelyi, M. A., Halper, F. B., & Ezawa, K. J. (1991). The continuous audit of online systems. The EDP Auditor Journal, 3, 85–91. Wallace, W. (1980). The economic role of the audit in free and regulated markets. New York, NY: Touche Ross Foundation. Wilcox, R. T. (2003). Bargain hunting or star gazing? Investors’ preferences for stock mutual funds. Journal of Business, 76(4), 645–663. Wilkinson, A., Elahi, S., & Eidinow, E. (2003). Background and dynamics of the scenarios. Journal of Risk Research, 6(4–6), 365–401. Woodroof, J., & Searcy, D. (2001). Continuous audit model development and implementation within a debt covenant compliance domain. International Journal of Accounting Information Systems, 2(3), 169–191. Yates, J. F. (1990). Judgment and decision making. Englewood Cliffs, NJ: Prentice Hall.
USING ELECTRONIC AUDIT WORKPAPER SYSTEMS IN AUDIT PRACTICE: TASK ANALYSIS, LEARNING, AND RESISTANCE Jean C. Bedard, Michael L. Ettredge and Karla M. Johnstone ABSTRACT Audit firms have adopted electronic workpaper systems in hope of improving efficiency and effectiveness, but prior research shows that expected gains are difficult to achieve. Using survey data from an international audit firm, this paper identifies individual task components involved in workpaper preparation and review, assesses the relative difficulty of performing each task, and examines the ‘‘learning curve’’ by relating difficulty to performance frequency. Tasks involving ‘‘navigation’’ around the system are found to be most difficult. Audit managers and partners rate the system as more difficult than other auditors, and report using fewer of its capabilities. There is some evidence of ‘‘working around’’ the system, including creating and storing information outside the system. The study’s results should be useful to audit firms in targeting training efforts, and have implications relating to compliance with Auditing Standard 3 on audit documentation.
Advances in Accounting Behavioral Research, Volume 10, 29–53 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10002-8
29
30
JEAN C. BEDARD ET AL.
INTRODUCTION This study investigates specific sources of difficulty faced by auditors in preparing and reviewing workpapers using electronic systems. The analysis is motivated by the ongoing shift at some audit firms from paper to fully electronic environments for audit work systems (Yang, 1993; Rothman, 1997; McCollum & Salierno, 2003). Other audit firms have adopted partially electronic systems (e.g., simply creating .pdf files for storage) or are still considering making this transition. Further, the use of electronic audit workpaper systems varies in practice, implying some resistance to full incorporation in everyday use. For example, some firms require all personnel to use the system, while others require only staff and senior auditors to use the system. Some firms allow individual partners to ‘‘opt out’’ of using the system on specific engagements, while others do not. Some firms require that all tasks are completed using the system, while others require only certain tasks to be completed using the system. Some firms require use of the system for electronic storage of documentation, while others allow storage of documentation in both electronic and paper formats. Further, electronic systems are not yet fully tailored to some industryspecific needs (e.g., governmental and nonprofit), so in these industries paper-based systems are more common. These variations in practice imply that the move toward electronic workpapers in the auditing industry is an ongoing, dynamic process.1 While the advent of electronic audit workpaper systems is an important change in audit practice, there are only a few studies investigating use of such systems. These studies reveal some potential effectiveness and efficiency difficulties (Bedard, Ettredge, Jackson, & Johnstone, 2003; Brazel, Agoglia, & Hatfield, 2004; Bible, Graham, & Rosman, 2005; Rosman, Bible, Biggs, & Graham, 2007). Prior research in other contexts is mixed, with some studies showing that knowledge-based systems using hypertext increase effectiveness and efficiency, and encourage learning (e.g., Spiro, Coulson, Feltovich, & Anderson, 1988). However, other studies show that information processing with hypertext is cognitively demanding because it uses some short-term memory that would otherwise be devoted to processing of task information (e.g., Thuring, Hannemann, & Haake, 1995). Further, hypertext environments promote nonlinear (nonsequential) processing, which may complicate task performance (e.g., Mills, Paper, Lawless, & Kulikowich, 2002). Due to the importance of audit workpaper integrity in practice, and mixed findings of prior research, further study of electronic workpapers in the auditing context is clearly warranted.
Using Electronic Audit Workpaper Systems in Audit Practice
31
This study focuses on the sources of difficulty in using electronic workpaper systems in practice, and the extent to which these difficulties are ameliorated by system experience. Data for this study were obtained from an international auditing firm that had recently introduced an electronic audit workpaper system in its U.S. practice.2 Study participants rated the difficulty of individual component tasks involved in preparing and reviewing audit workpapers electronically, and the frequency with which they accomplish those tasks. The taxonomy contains 38 component tasks performed by workpaper preparers (staff and seniors), and 28 component tasks performed by workpaper reviewers (managers and partners), classified into the categories of system security, data input, organization of the file and verification of data, and review. Results show that reviewers find the electronic system more difficult to use than preparers. While the mean difficulty ratings of component tasks are generally low, some tasks are rated more difficult than others. Particularly, higher difficulty ratings are observed in tasks involving ‘‘navigating’’ around the electronic system, consistent with the findings of Bible et al. (2005). These tasks include ensuring that workpapers are updated for adjusting journal entries, and tracing amounts from financial statements to lead sheets, among others. Regarding the overall learning curve associated with the new system, results show that it takes about five engagements on average before personnel are comfortable using the electronic system, although reviewers report a higher mean and greater range on this measure. Auditors at all levels report a significant increase in using the full capabilities of the system over time. However, there is evidence of variance in full use of the system, based on responses to questions about behaviors suggestive of ‘‘working around’’ the system (e.g., creating review notes on paper outside of the electronic system).3 Both preparers and reviewers report reduced incidence of working around the system as they gained familiarity with it, but there remain some reports of these behaviors even after system familiarity is achieved. In sum, the study’s results show improvement in system use with practice. However, there are specific pockets of difficulty that persist even after considerable system use. These results are useful to audit practice in developing new electronic audit workpaper systems, revising existing systems, or focusing training on existing users. Further, leaders of audit firms may find this study useful as they consider implications of variation in documentation (and related regulatory risk) associated with maintaining audit workpapers under Auditing Standard No. 3 (PCAOB, 2004). The study’s findings will also
32
JEAN C. BEDARD ET AL.
assist researchers in designing studies about specific features of the audit workpaper preparation and review process.4
USING ELECTRONIC AUDIT WORKPAPER SYSTEMS Entities of all types are incorporating electronic technologies with the objective of improving effectiveness and efficiency of business processes. In order for technologies to achieve these goals, users must both initially accept them and persist in using them in ways consistent with the goals of the organization. There is an extensive body of research addressing user intentions regarding technology use and technology adoption, based on such theories as the technology acceptance model (e.g., Davis, Bagozzi, & Warshaw, 1989) and the theory of planned behavior (Azjen, 1991).5 As noted by Kim and Malhotra (2005), researchers have recently focused significant attention on issues regarding post-adoption system usage. While studies have shown that users’ system perceptions continue to influence their intentions to use systems through time (e.g., Venkatesh, Morris, & Ackerman, 2000), Kim and Malhotra (2005) propose and find that post-adoption usage at any point in time (among students using a web-based information system) is highly influenced by prior system use. This research justifies study of post-adoption usage, as it implies that individuals continually update their system perceptions as patterns of use become habituated. Within this literature, some studies consider user resistance to system implementation. Research specifically targeted at work-around behaviors in complex professional tasks shows that the goal of the ‘‘paperless’’ office is often not completely achieved, and that employees often attempt to circumvent electronic systems by reverting to paper processing. Studies in contexts other than auditing show that users bypass newly implemented work systems by reverting to the former system for certain tasks (Chau, 1996), by duplicating tasks in both old and new systems (Sellen & Harper, 2002), and/or by not using the new system correctly (Markus, 1983; Hartwick & Barki, 1994). In addition, even when individuals have a strong motivation to appropriately use an electronic system, their success may be limited because of the complexity of the task and associated disorientation within the electronic system (e.g., Nielsen, 1990), or because their task knowledge is not sufficiently well developed to successfully leverage system features (Mills et al., 2002). The system that is the subject of this study uses hypertext to link specific workpaper items across the file. Thus, the specific literature on hypertext is
Using Electronic Audit Workpaper Systems in Audit Practice
33
also relevant to the research questions. Prior literature (often in the context of internet search) shows that hypertext is a valuable tool because it preserves associations among domain concepts (Jonassen, 1990), encourages exploration (Stanton & Stammers, 1990), and enables knowledge acquisition in complex task settings (Spiro et al., 1988). While these studies show the value of hypertext, other research shows that hypertext use can be affected by individual users’ characteristics.6 Thus, the ability to use hypertext effectively likely varies across individuals, resulting in different levels of adoption and speeds of integration into the professional’s work routine. In sum, the research cited above suggests that while organizations wish to improve the flow and outputs of work by systems implementation, pockets of resistance to those systems might exist in the post-adoption environment. In the case of fully integrated auditing workpaper systems, the potential consequences of system resistance are serious. The audit workpaper is a legal document containing evidence supporting the audit opinion. The completed workpaper compiles evidence that, for audits of large companies, is accumulated over a period of time by many individual professionals acting in a hierarchy. Each firm has a defined, complex set of procedures that must be performed in a certain order, aggregated, and reviewed for completeness. Bypassing the system by working off-line can affect efficiency, effectiveness, or both. Inefficiency could result if tasks are duplicated, while ineffectiveness could result if key workpapers are lost or the file is not constructed correctly, so that it cannot be easily reviewed. Further, if preparers of the engagement file print out workpapers or lead sheets during the engagement, they will not be using the system linkage and crossreferencing capabilities. Thus, subsequent reviewers of the file will be unable to perform an efficient review. Creating review notes on paper at any point in the team hierarchy will also result in subsequent reviewers being unable to access them from remote locations. Thus, there are potentially important consequences to working around the system by resisting electronic functionalities and reverting to paper processing. In addition to consequences for the engagement itself, the integrity of the workpaper is crucial as a foundation for later inspection by the Public Company Accounting Oversight Board (PCAOB) (PCAOB, 2004). Despite the key role of electronic workpaper systems, there is little research on the audit effectiveness and efficiency implications of these systems. Bible et al. (2005) find audit effectiveness decrements resulting from difficulties in navigating around an electronic audit workpaper system. Extending those findings, Rosman et al. (2007) show that auditors’ difficulties in electronic environments are associated with system
34
JEAN C. BEDARD ET AL.
complexity, and that the most successful auditors in the electronic environment adapt to it by limiting the extent of their navigation around the system and instead focusing on understanding and remembering the information gained from the system. Further, Brazel et al. (2004) show that, compared to those anticipating face-to-face review, auditors anticipating electronic review are less concerned about audit effectiveness, more likely to be influenced by prior workpapers, and feel less accountable for their work. Further, Glover, Prawitt, and Romney (2000) find that many internal auditors report using internally developed software in performing their professional roles, but their satisfaction with these tools varies widely. While these studies motivate further research on effectiveness and efficiency in electronic versus paper environments, the issue arises as to whether such effects would be limited to new system applications, or whether they would persist following training and/or practice. On the issue of training effects, Bedard et al. (2003) find that face-to-face training prior to implementation improves auditors’ perceptions of system quality and intentions toward using a new electronic workpaper system. However, they also find that auditors’ perceptions of their own ability to perform audit tasks using the system does not necessarily improve with training. Thus, the little evidence on improvements due to training in this context is mixed. In order to identify sources of difficulty in accomplishing work within an electronic audit workpaper system, it is important to understand the basic nature of the tasks that auditors complete. However, published academic or practitioner articles do not describe the exact nature of the tasks that auditors with different work roles accomplish using electronic audit workpaper systems. Gaining this understanding is important from a practical standpoint for guiding implementation and training, but it is also important because performing good research in auditing requires detailed understanding of the component tasks involved (Abdolmohammadi & Usoff, 2001; Trotman, 2005). The first set of research questions considers issues involving the identification of component tasks, the assessment of relative task difficulty, and comparisons of difficulty across paper and electronic environments. RQ1. What are the component audit tasks involved in an electronic audit workpaper system, for auditors in preparer and reviewer roles? RQ2. What is the relative difficulty of performing component workpaper tasks?
Using Electronic Audit Workpaper Systems in Audit Practice
35
RQ3. What is the relative frequency of performing component workpaper tasks? RQ4. Which component tasks are more difficult in an electronic system compared to a traditional paper audit workpaper system? With the component task difficulty and frequency ratings as background, this study next investigates the transition to using the electronic work system, and its post-adoption use. As previously noted, research in other business contexts finds that the full benefit of paperless office systems might not be achieved because employees work around systems. In the current context, there is some evidence of effectiveness problems associated with electronic systems (Bible et al., 2005). However, repeated performance of workpaper tasks within the context of an auditor’s normal practice may resolve these problems over time. Based on prior research showing that system usage patterns emerge through experience (e.g., Kim & Malhotra, 2005), H1 proposes that users’ difficulty perceptions are inversely associated with system experience, and H2 proposes that work-around behaviors also decline as experience with the system is gained. H1. User perceptions of system difficulty decrease with system experience. H2. The frequency of user resistance behaviors decreases with system experience.
METHOD Description of the Electronic Audit Workpaper System The electronic audit workpaper system of the participating audit firm encompasses all phases of the audit process. Auditors begin the process of engagement file construction by gaining access to the system, which is password protected and has file-sharing features that enable remote users to simultaneously access and change the file. Once access is gained to the system, workpaper preparers work with a master file containing generalized procedures that enables the auditors to conduct an effective audit that appropriately controls risks. Workpaper preparers can tailor the file to address specific client risks, including setting the strategy to be used on the engagement and altering the nature, timing, and/or extent of planned audit
36
JEAN C. BEDARD ET AL.
procedures. Workpaper reviewers can electronically access the file, making changes and electronically inserting review comments. Once planning is accomplished, auditors use the client-tailored engagement file to perform the engagement. The file contains standard workpaper templates for auditing routine account balances, and auditors use these to update information from prior years. Auditors insert electronic memos into the file in order to document discussions with the client or create short notes about the results of a test or procedure. Copies of related files can be embedded within the master file or an electronic link can be made between files. The system has various functionalities that assist auditors, including electronic tickmarks, and the generation of a workpaper reference list that documents all tasks accomplished and reviewed, and all tasks for which work still needs to be accomplished. The system uses a cascading windows-type feature that enables auditors to view and copy portions of various files on the computer screen at the same time, and there is an electronic scratch pad for making quick mathematical calculations. The audit firm’s decision aids are linked into the system, including the audit sampling tool. The system automatically records which system user accomplished each audit task, and the time that each task was accomplished. Auditors save electronic copies of the file at least twice daily, and the system saves changes to the file and stores them at a secure, remote site daily. The completed engagement file is also electronically archived at a secure, remote location. Finally, the system contains a roll-forward feature that makes it possible to create a new engagement file while maintaining the tailoring performed in the prior year.
Research Procedures and Sample System developers and other personnel at the participating international audit firm assisted in the development of a survey instrument to assess auditors’ perceptions of the relative difficulty of the component tasks of preparing and reviewing electronic workpapers, and the relative frequency of use. The instrument also captures self-reports of behaviors inconsistent with the goal of electronic processing and storage of audit information (i.e., resisting the system by ‘‘working around’’ it). The instrument was distributed by contact personnel at 12 offices of the firm. Complete responses were obtained from 119 professionals in those offices, a response rate of about 70%. Of the respondents, 24 are audit staff, 45 are seniors,
Using Electronic Audit Workpaper Systems in Audit Practice
37
27 are managers, and 23 are partners.7 Respondents had experience using the electronic system for one or two years, for multiple engagements in each year. The mean number of engagements using the system is 23 for preparers and 36 for reviewers.
Variable Measurement and Testing The survey instrument assesses the relative frequency and difficulty of using all component tasks of an audit using an electronic workpaper system. To address RQ1 (identifying task components), system developers assisted in providing precise steps involved in constructing and reviewing workpapers on this system. These tasks are shown in Table 1. While the existing literature provides examples of various audit tasks (e.g., Abdolmohammadi, 1999; Rich, Solomon, & Trotman, 1997), these general taxonomies do not have a level of detail fine enough to be of use in understanding the specific steps of workpaper preparation and review. Thus, developers of this system are the individuals most able to enumerate the steps involved in preparing and reviewing audit workpapers on the system, consistent with generally accepted auditing standards. For purposes of analysis, the component tasks are categorized according to major phases of the preparation or review process. To address RQ2 and RQ3, participants assessed relative difficulty and frequency, respectively, for each component task relevant to their workpaper role. For some component tasks, there is an equivalent audit task in a paper-based system (e.g., creating review notes); and, for those tasks, participants also made frequency and difficulty assessments relative to the paper environment.8 Task difficulty ratings between electronic and paper environments are compared to address RQ4. H1 is tested using the correlation between component task difficulty and task frequency ratings. In addition, the effects of experience are assessed on a system-wide level by comparing responses to questions regarding the extent to which auditors felt they were using the system’s full capabilities, on the first few engagements and after gaining familiarity with the system. To test H2, the researchers worked with system developers to identify behaviors consistent with ‘‘working around’’ the system. Participating auditors are asked to indicate the extent that they engaged in those behaviors on the first few engagements using the new system and after gaining familiarity with the system, in order to measure whether such behaviors decline with system experience.
38
JEAN C. BEDARD ET AL.
Table 1. Relative Difficulty and Frequency of Using Audit Tasks in an Electronic Workpaper System. Difficulty Mean (s.d.) Panel A. Preparers Security tasks Manage the security of the file Check in files Backup files to external media (to a CD) Distribute checked out files Create checked out files Ensure the appropriate use of signatures by team member who did the work Backup files to the network Sign into electronic audit system file Average of security tasks Input tasks Annotate and complete scanned workpapers electronically Create scanned documents Import client and engagement information from separate databases Insert scanned documents Refresh workpapers inserted in electronic audit system file after AJEs are booked Use tickmarks Annotate and complete Excel and Word workpapers electronically Insert financial statement workpapers Create Excel and Word workpapers Insert Excel and Word workpapers Create memos Average of input tasks Organization/verification tasks Ensure that workpapers are updated for AJEs Find workpapers and memos Agree lead sheets to workpapers Trace an amount from the financial statements to the lead sheets and workpapers Cross-reference workpapers to supporting documents, lead sheets and financial statements Cross-reference workpapers to lead sheets and supporting documents
1.9 1.8 1.8 1.8 1.7 1.7
Percent rating as difficult
Frequency Mean (s.d.)
(1.0) (1.0) (1.0) (0.9) (0.9) (0.8)
4.1 9.1 7.6 6.1 4.5 4.6
3.3 4.3 3.6 4.2 4.3 4.2
(1.5) (1.1) (1.4) (1.2) (1.1) (1.0)
1.5 (0.7) 1.0 (0.2) 1.6 (0.5)
1.5 0
3.7 (1.3) 4.9 (0.2) 4.1 (0.7)
3.0 (1.5)
39.7
2.5 (1.5)
2.6 (1.5) 2.2 (1.2)
28.1 18.8
2.4 (1.5) 1.6 (1.2)
2.0 (1.3) 2.0 (1.3)
13.6 16.7
2.8 (1.5) 4.7 (0.6)
1.7 (1.0) 1.6 (0.9)
7.3 2.9
4.5 (0.9) 4.7 (0.6)
1.4 1.4 1.3 1.1 1.9
(0.7) (0.8) (0.6) (0.4) (0.6)
3.0 1.5 1.5 1.5
4.6 3.9 4.9 4.9 3.7
(0.7) (1.6) (0.3) (0.4) (0.5)
2.5 2.4 2.1 2.1
(1.2) (1.1) (1.1) (1.0)
22.7 13.2 14.7 8.8
4.4 4.6 4.6 4.4
(0.8) (0.8) (0.7) (1.0)
2.1 (1.0)
8.8
4.4 (1.0)
2.0 (1.0)
10.3
4.3 (1.0)
Using Electronic Audit Workpaper Systems in Audit Practice
39
Table 1. (Continued ) Difficulty
Agree workpapers to lead sheets and supporting documents Determine which review notes have not been cleared Determine which procedures have not been completed Determine which review notes have not been closed Determine which workpapers/memos have not been reviewed Determine which workpapers/memos have not been signed off Sign off workpapers and memos All organization/verification tasks Review tasks Review Excel workpapers on screen Review Word workpapers on screen Close review notes Create review notes Review memos on screen Respond to review notes All review tasks Panel B. Reviewers Security tasks Manage the security of the file Ensure the appropriate use of signatures by the engagement team member who did the work Sign into electronic audit system file Average of security tasks Organization/verification tasks Ensure that workpapers are updated for AJEs Find workpapers/memos Sign off workpapers/memos as reviewed Determine which workpapers/memos have not been reviewed Determine which review notes have not been closed Average of organization/verification tasks
Frequency
Mean (s.d.)
Percent rating as difficult
Mean (s.d.)
2.0 (1.0)
5.9
4.6 (0.6)
1.7 (0.8) 1.7 (0.8)
1.5 1.5
4.1 (1.2) 4.3 (0.9)
1.6 (0.8) 1.6 (0.8)
1.5 2.9
4.0 (1.2) 4.2 (1.0)
1.6 (0.8)
2.9
4.3 (1.0)
1.2 (0.7) 1.9 (0.7)
2.9
4.9 (0.2) 4.4 (0.6)
2.1 1.7 1.4 1.4 1.3 1.3 1.6
(1.1) (0.9) (0.7) (0.6) (0.7) (0.6) (0.6)
14.9 7.5 1.5 0 1.5 0
4.4 4.4 4.0 4.1 4.4 4.4 4.3
2.3 (1.0) 1.9 (1.0)
10.2 6.3
2.9 (1.3) 4.0 (1.2)
1.2 (0.5) 1.8 (0.6)
2.0
4.7 (0.6) 3.9 (0.7)
3.3 2.9 1.6 1.6
(1.2) (1.1) (1.0) (0.8)
46.9 32.7 10.0 4.1
1.5 (0.8) 2.2 (0.7)
2.0
4.3 (1.1) 4.4 (0.6)
44.0
3.9 (1.1)
Review tasks Trace an amount from the financial statements to the 3.3 (1.1) lead sheets and workpapers
3.8 4.5 4.8 4.6
(1.1) (1.1) (1.3) (1.2) (1.0) (1.0) (0.9)
(1.4) (0.7) (0.5) (0.6)
40
JEAN C. BEDARD ET AL.
Table 1. (Continued ) Difficulty
Determine that workpapers or memos have been prepared for all significant balances appearing on the lead sheets Ensure that the workpapers have been properly crossreferenced to supporting documents, lead sheets, and financial statements Ensure that workpapers agree to lead sheets Determine which workpapers are key Review Excel workpapers on screen Review scanned documents on screen During fieldwork, determine what changes were made to audit procedures approved at the planning stage Trace an amount from a workpaper to a supporting document Indicate that you as reviewer have agreed an amount to a lead sheet or supporting document Insert the reviewer’s tickmarks into a workpaper At the planning stage, determine what tailoring changes were made to the audit procedures Send a file that you have worked on to another engagement team member Review Word workpapers on screen Ensure that all review notes have been properly cleared Create review notes Indicate that an engagement team member’s response to a review note is not adequate Delete review notes from the engagement file at the end of the engagement Review memos on screen Close review notes Average of review tasks
Frequency
Mean (s.d.)
Percent rating as difficult
Mean (s.d.)
3.2 (1.3)
48.0
4.3 (0.9)
3.0 (1.0)
30.6
3.6 (1.2)
2.9 2.8 2.7 2.6 2.5
(1.2) (1.2) (1.1) (1.3) (1.0)
32.7 36.7 24.5 27.1 18.4
4.0 (1.1) 3.5(1.2) 4.8 (0.5) 3.1 (1.6) 2.6 (1.0)
2.5 (1.0)
14.0
3.9 (1.3)
2.4 (1.3)
18.4
3.3 (1.6)
2.2 (1.3) 2.1 (1.0)
18.4 12.0
3.1 (1.7) 4.1 (1.1)
1.9 (1.1)
10.2
4.3 (1.2)
1.7 (0.8) 1.7 (0.9)
2.0 4.1
4.8 (0.5) 4.4 (1.0)
1.6 (0.9) 1.4 (0.8)
8.2 4.2
4.4 (1.3) 3.3 (1.4)
1.3 (0.6)
0
4.1 (1.3)
1.3 (0.6) 1.3 (0.7) 2.2 (0.6)
0 2.1
4.9 (0.4) 4.4 (1.2) 3.9 (0.7)
Note: Task difficulty is measured on a scale of 1 (very easy) to 5 (very difficult). Task frequency is measured on a scale of 1 (very rarely) to 5 (very often). In addition to means and standard deviations, the table reports the percent of auditors indicating the task is relatively difficult (i.e., a difficulty rating of 4 or 5).
Using Electronic Audit Workpaper Systems in Audit Practice
41
RESULTS Defining the Component Tasks and Assessing Task Difficulty Table 1 reports results relating to our first four research questions.9 Regarding RQ1, the major tasks that auditors complete in electronic audit workpaper systems include security, data input, organization/verification, and review (see Table 1 for specific tasks within each of these categories). RQ2 concerns the difficulty of various component workpaper tasks. Panel A shows that mean difficulty ratings are fairly low, indicating that at this stage in use most participants find it relatively easy to use. For preparers, mean difficulty ratings are 1.6 on the five-point scale for security and review tasks (indicating between ‘‘easy’’ and ‘‘very easy’’), and 1.9 for input and organization/verification tasks. While the means reveal little cause for concern, there is considerable variation in difficulty within the task categories.10 For instance, almost 40% of preparers indicate difficulty in annotating and completing scanned workpapers electronically (i.e., a response of ‘‘Difficult’’ or ‘‘Very Difficult’’). Other component tasks with high percentages of difficulty ratings within the data input category include creating scanned documents (28.1%), importing client information from external databases (18.8%), and refreshing workpapers after AJE’s are booked (16.7%). The most difficult component tasks in organization/ verification include ensuring that workpapers are updated for adjusting journal entries (22.7%) and agreeing lead sheets to workpapers (14.7%). All of these tasks are crucial to constructing and maintaining accurate and complete workpapers. Panel B of Table 1 shows that following one to two years of system use, reviewers’ difficulty ratings are somewhat higher, but means are still below the ‘‘neutral’’ level. Specifically, reviewers rate organization/verification and review tasks (mean=2.2) as more difficult than security tasks (mean=1.8). The most difficult organization/verification tasks for reviewers include ensuring that workpapers are updated for AJEs (46.9%) and finding workpapers/memos (32.7%). Some of the review tasks causing the most difficulty include determining that workpapers or memos have been prepared for all significant balances appearing on the lead sheets (48%), tracing amounts from the financial statements to the lead sheets and workpapers (44%), determining which workpapers are key (36.7%), ensuring that the workpapers agree to lead sheets (32.7%), and ensuring that workpapers have been properly cross-referenced to supporting documents (30.6%).11
42
JEAN C. BEDARD ET AL.
The evidence in Table 1 yields several insights. First, while many tasks are relatively easy, the proportion of auditors indicating difficulty with some component tasks is fairly high, even after fairly extensive electronic experience. Second, there is considerable variance in difficulty ratings of component tasks within the input and organization/verification task categories for preparers, and within the organization/verification and review categories for reviewers. This suggests that it is not the activity that is being performed, but some aspect of performing it electronically, that is causing the problem. Third, reviewers’ responses indicate greater difficulty on most dimensions than preparers’ responses. Thus, although reviewers perform their tasks on more engagements than preparers (a mean number of electronic engagements of 36 versus 23 for preparers), they continue to have difficulty with some aspects of the task, perhaps leading to inefficiency.12 RQ3 concerns the relative frequency with which component tasks are performed. Table 1 Panel A shows that more frequent tasks for preparers include signing onto the electronic workpaper system, inserting Excel and Word workpapers, and creating memos. Panel B shows that more frequent tasks for reviewers include signing off workpapers and memos as reviewed, reviewing Excel and Word workpapers on screen, and reviewing memos on screen. Table 2 reports results of RQ4, concerning which tasks are more difficult in an electronic versus a traditional paper environment. For preparers, the tasks whose difficulty increases most in the shift to an electronic environment include agreeing lead sheets to workpapers, agreeing workpapers to lead sheets and supporting documents, tracing amounts from the financial statements to the lead sheets and workpapers, and finding workpapers and memos. For reviewers, tasks whose difficulty increases the most upon shifting to an electronic environment include tracing an amount from the financial statements to the lead sheets and workpapers, determining that workpapers/memos have been prepared for all significant balances appearing on the lead sheets, ensuring that workpapers agree to lead sheets and are updated for adjusting entries, and ensuring workpapers are properly cross-referenced to supporting documentation. Taken together, these results imply that the electronic environment seems to present important difficulties to both preparers and reviewers in navigating around the electronic file. It is also interesting to note that reviewers’ mean difference in difficulty between paper and electronic environments is much higher than that of preparers’, providing further evidence that the shift toward an electronic environment is more challenging for reviewers.
Using Electronic Audit Workpaper Systems in Audit Practice
Table 2.
43
Description of Tasks that Are More Difficult in an Electronic Environment.
Task Description
Panel A. Preparers Agree lead sheets to workpapers Agree workpapers to lead sheets and supporting documents Trace an amount from the financial statements to the lead sheets and workpapers Find workpapers and memos Panel B. Reviewers Trace an amount from the financial statements to the lead sheets and workpapers Determine that workpapers or memos have been prepared for all significant balances appearing on the lead sheets Ensure that workpapers agree to lead sheets Ensure that workpapers are updated for AJEs Ensure that the workpapers have been properly cross-referenced to supporting documents, lead sheets, and financial statements Trace an amount from a workpaper to a supporting document Find workpapers/memos Indicate that you as reviewer have agreed an amount to a lead sheet or supporting document
Mean Mean Difficulty in Difficulty in Paper Electronic Environment Environment (percent=4 or (percent=4 or 5 rating) 5 rating)
Mean Difference
MatchedPairs t
2.1 (15%) 2.0 (10%)
1.3 (0%) 1.3 (0%)
0.8 0.7
4.955*** 4.925***
2.1 (9%)
1.5 (2%)
0.6
3.782***
2.4 (13%)
1.9 (5%)
0.4
4.955***
3.3 (44%)
1.5 (5%)
1.8
8.356***
3.2 (48%)
1.6 (7%)
1.6
5.974***
2.9 (33%)
1.5 (2%)
1.4
6.898***
3.3 (47%)
2.1 (10%)
1.1
5.436***
3.0 (31%)
1.9 (7%)
1.1
5.665***
2.5 (14%)
1.6 (2%)
0.9
5.050***
3.0 (33%) 2.4 (18%)
2.1 (12%) 1.6 (5%)
0.8 0.8
3.579*** 3.106***
Note: p0.01.
Transition and Learning Issues: Tests of Hypotheses The second set of findings addresses transition and learning issues. Table 3 reports results of testing H1, which predicts that auditors’ perceptions of
44
JEAN C. BEDARD ET AL.
Table 3.
Factors Affecting Auditors’ Learning on New Electronic Audit Workpaper Systems.
Panel A. Pearson correlation between difficulty and frequency, by task category
Overall Top ten most difficult tasks Security tasks Input tasks Organization/verification tasks Review tasks
Preparers
Reviewers
0.376 0.286 0.422 0.668 0.144 0.184
0.065 0.028 0.248 n/a 0.027 0.129
Panel B. The learning curve: self-reports on using the full capabilities of the electronic system Mean (First Few Mean (After Gaining Engagements) Familiarity) Preparers Reviewers t-Test between ranks:
2.57 2.40 1.009
3.70 3.38 2.161
Mean Difference
t-Test
1.13 0.98
11.973 9.705
Note: Panel B data represent auditors’ responses to the question: ‘‘How often do you believe you were using the full capabilities of the system (on the first few engagements, and after you got used to using the system)?’’ The response scale is: 1=Never, 2=Infrequently, 3=Sometimes, 4=Frequently, 5=Always. pr0.10. pr0.05. pr0.01.
task difficulty will decrease with system experience. This hypothesis is tested using the correlation between the task performance frequency and perceived task difficulty scores from Table 1. For workpaper preparers, the overall correlation between difficulty and frequency is significantly negative ( 0.376; po0.001), as is the correlation between difficulty and frequency on the ten most difficult tasks ( 0.286, po0.001). This implies that performing tasks more frequently reduces perceived difficulty for these auditors. In contrast, most of the correlations for reviewers are not significant, implying that ‘‘learning by doing’’ is not effective in reducing task difficulty for managers and partners. This suggests that managers and partners require more intensive training and assistance in order to become comfortable using the system. Organization/verification is the only task category for preparers in which there is not a significant negative correlation, suggesting no learning curve effect for this category. For
Using Electronic Audit Workpaper Systems in Audit Practice
45
reviewers, the only task category in which there is a significant negative correlation is file security ( 0.248; po0.05). These findings reinforce the organization/verification tasks as a key source of difficulty in preparing and reviewing electronic workpapers, as practice does not yield improvement in that category for either group. Table 3 Panel B shows results of testing H1 at a system-wide level; i.e., by asking auditors to judge the extent to which they were using the full capabilities of the electronic system, both on their first few engagements and after they became familiar with it. Means for preparers and reviewers are 2.57 and 2.40, respectively, on the first few engagements (indicating between ‘‘infrequently’’ and ‘‘sometimes’’ using the system’s full capabilities). After gaining familiarity, preparers’ and reviewers’ reports average 3.70 and 3.38, respectively (between ‘‘sometimes’’ and ‘‘frequently’’). Thus, both workpaper preparers and reviewers report significant improvement in using the system’s full capabilities once they get used to the system, indicating that practice does improve system acceptance at a global level for both groups. System learning can also be measured by considering the number of engagements needed to feel comfortable using the electronic system. Preparers’ responses to this question (not tabled) indicate a mean of 4.6 engagements (with a range of 1–12) to become comfortable with the system, while reviewers’ mean is 6.3 engagements (with a range of 1–40). The greater mean and range for reviewers is consistent with previously reported findings that reviewers have greater difficulty adapting to the system.
Resistance Issues Table 4 presents results of testing H2, which predicts that auditor resistance to the system will decrease with experience. System resistance is measured through specific questions asking for the frequency with which certain ‘‘work-around’’ behaviors are performed. Results in Panel A show that on average, preparers report moderate levels of working around the system on the first few engagements, with means representing incidence of these behaviors between ‘‘infrequently’’ and ‘‘sometimes.’’ The most common behavior in working around the system on the first few engagements is storing or maintaining workpapers on paper instead of in electronic form (mean=3.01). The most common behavior after getting used to the system is printing out lead sheets and cross-referencing them to the financial statements and audit workpapers rather than doing so electronically
46
JEAN C. BEDARD ET AL.
Table 4.
Frequency of Work-Around Behaviors and ExperienceRelated Differences.
Panel A. Preparers How often did you print out workpapers so that they can be annotated and completed? How often did you store or maintain workpapers in a paper binder? How often did you print out lead sheets and cross-referenced them to financial statements and audit workpapers? How often did you create review notes on a piece of paper as opposed to within the electronic system? How often did you suggest to manager or partner that the electronic engagement should be terminated so that it can be completed using a paper binder? Panel B. Reviewers How often did you print out workpapers so that they can be reviewed? How often did you create review notes on a piece of paper as opposed to within the electronic system? How often did you terminate the electronic engagement and require that it be completed using a paper binder?
Mean on First Few Engagements
Mean After Getting Used to the System
Mean Difference
MatchedPairs t-Test
2.75
1.97
0.78
10.079***
3.01
2.28
0.74
8.880***
2.90
2.43
0.46
5.357***
2.42
2.04
0.38
4.176***
1.62
1.26
0.36
3.832***
2.98
2.18
0.80
7.483***
2.52
2.10
0.42
5.168***
1.04
1.02
0.02
1.000
Note: The response scale for questions in this table is: 1=Never, 2=Infrequently, 3=Sometimes, 4=Frequently, 5=Always. r0.01.
Using Electronic Audit Workpaper Systems in Audit Practice
47
(mean=2.43). However, in each case, practice with the system significantly reduces the incidence of attempts to work around the system. Results in Panel B show that reviewers also report moderate incidence of two work-around behaviors on the first few engagements (i.e., printing out workpapers and creating review notes on paper). The means of 2.98 and 2.52 represent an incidence of these behaviors between ‘‘infrequent’’ and ‘‘sometimes.’’ (Very few instances of terminating an engagement are reported.) The mean responses to these questions judged after getting used to the system are 2.18 and 2.10, respectively. For both behaviors, practice significantly reduced the reported incidence. However, there are some reports of work-around behaviors by both preparers and reviewers even after getting used to the system.
CONCLUSIONS This paper describes component tasks involved in preparing and reviewing audit workpapers using a fully integrated electronic audit workpaper system. We investigate relative difficulty and frequency of performance of these component tasks, the learning curve for electronic workpaper systems, and ways in which auditors try to work around the system following implementation. The findings provide considerable information about the processes of constructing and reviewing audit workpapers in electronic environments. However, this paper’s methods have several limitations that provide possibilities for future research. First, only a single system is studied, for one audit firm. While the taxonomy of component tasks is likely generalizable, findings on relative difficulty and frequency of component tasks may vary in different systems. Therefore, additional evidence from other audit firms and across various types of electronic audit workpaper systems would be useful in further describing contemporary audit practice. Second, system developers provided information on ways in which users might work around the system. While these individuals are most knowledgeable about specific operations of the system, they might not be aware of all possible ways that users might devise to work around it. Third, there is a possibility of response bias, in that individuals who were uncomfortable reporting work-around behavior might not have responded to the survey or might not have responded honestly. If so, the results would understate the actual frequency of work-around behaviors in practice. Lastly, participants had used the system under study for one to two years. There is improvement in system use over that period, but some resistance remains. While further
48
JEAN C. BEDARD ET AL.
use may possibly result in complete conformation to system use guidelines, this research is still useful in identifying areas of relative difficulty after an extensive period of practice. The first set of results relates to relative difficulty of the 38 tasks performed by workpaper preparers (seniors and staff), and 28 tasks performed by workpaper reviewers (managers and partners) within the electronic workpaper system. These tasks are classified into categories of system security, data input (setting up the engagement file), organization of the file and verification of data, and review. Results show that reviewers consider working with the electronic system to be more difficult than do preparers. The results also reveal that the tasks that seem most difficult for both preparers and reviewers involve navigating around the electronic system. For example, almost half of preparers report difficulty completing tasks such as tracing amounts from the financial statements to lead sheets/ workpapers, determining that workpapers/memos have been prepared for all significant account balances, and ensuring that workpapers are updated for adjusting journal entries. Further, when auditors compared the difficulty of tasks in electronic and paper environments, the results also reveal that ‘‘navigation’’ tasks are particularly difficult. The finding of persistent navigation difficulties, even after one to two years of using a system carefully designed to assist file construction and review, reinforces the results of Bible et al. (2005) on navigation problems in electronic audit workpaper systems. Therefore, system design, implementation, and training need to be especially targeted toward addressing this difficulty. The second set of results relates to transition and learning effects. Correlations of task difficulty and frequency ratings show that completing tasks more frequently within the electronic system is helpful in reducing task difficulty for workpaper preparers, but not for reviewers. Since preparers spend more time actually using the workpapers in their job role (Rich et al., 1997), it may be that while frequency improves difficulty perceptions for these auditors, this relationship takes longer to develop for reviewers because they simply spend less time on the task. Regarding the overall learning curve associated with the new system, results show that preparers need between four and five engagements before feeling comfortable using the new system, whereas reviewers need between six and seven engagements. Auditors also report a significant increase in using the full capabilities of the system once they become familiar with it. Once familiarity is gained, preparers indicate greater use of the system’s full capabilities than reviewers. These findings suggest that training using highly realistic cases is important, and that oversight or peer review may be appropriate to
Using Electronic Audit Workpaper Systems in Audit Practice
49
ensure quality on engagements when teams transition to a new electronic system. The third set of findings concerns resistance issues. While the use of selfreports to capture these behaviors might under-represent their incidence, personnel at all levels report some behaviors indicating ‘‘working around’’ the system (e.g., creating review notes on paper outside of the electronic system or storing documentation in a paper binder instead of on the electronic system). Both preparers and reviewers report reduced incidence of working around the system as they gained familiarity with it, but some auditors still report such behavior even after they have gained significant familiarity with the system. This variation in how audit evidence is documented is potentially important because it may result in difficulties in later retrieving evidence for internal quality control and for PCAOB inspection teams.13 The possibility of subsequent documentation problems has greater import under Auditing Standard No. 3 than under previous auditing standards. Audit firms currently using electronic audit workpaper systems, and those transitioning to such systems, should be aware of these findings and should make efforts to ensure that auditor resistance does not result in failure to comply with professional standards. For researchers, this study’s findings motivate emerging research on the audit effectiveness/efficiency implications that may be associated with electronic audit workpaper systems. Bible et al. (2005) summarize this literature by noting that prior research has not demonstrated that electronic environments facilitate information processing (e.g., Dillon, 1996). To the contrary, findings of studies within and outside of auditing are consistent in showing performance problems associated with the cognitive load involved in navigating around electronic environments, thus causing disorientation (e.g., McDonald & Stevenson, 1996). While Bible et al. (2005) demonstrate performance decrements associated with electronic environments, Rosman et al. (2006) find that specific decision processes overcome this difficulty. Further, studies outside of auditing such as Mills et al. (2002) show that greater domain knowledge is associated with better ability to navigate through a hypertext environment. Complementing these studies, this research shows that difficulties with performing some audit tasks on a new system decline with practice, but do not disappear completely. From an audit effectiveness perspective, future research could investigate individual auditor characteristics that influence task difficulty, and how task difficulty perceptions subsequently affect individual auditor decision-making (e.g., during the workpaper review process). In addition, research could investigate the extent to which avoiding electronic workpaper systems by ‘‘working
50
JEAN C. BEDARD ET AL.
around’’ them affects auditor decision-making and the required documentation of audit evidence, and whether these behaviors persist in mature systems. From an efficiency perspective, studies could investigate the cost-benefit tradeoffs associated with the shift to electronic audit workpaper systems, and how the learning curve on the new systems affects audit efficiency. Studies comparing training methods might also be directed toward auditing students, as effective preparation before entering the workplace will ease the transition for students and reduce cost to firms once they are employed. In addition to the above research implications, this study’s findings also contribute to audit practice and education. For audit practice, the results provide insight on implementation of electronic audit workpaper systems, including information about the tasks completed within the system, about auditors’ perceptions of the difficulty of those tasks, and auditors’ reports on how frequently they use those features of the system. This information should be useful to other audit firms as they design and update their own electronic audit workpaper systems. Further, the results provide evidence on the transition and learning issues associated with the adoption of an electronic audit workpaper system, and provide evidence on the existence and nature of auditor behaviors associated with resisting the new system. Understanding these features should assist system developers and audit firm personnel as they consider potential implementation costs and training needs associated with new electronic audit workpaper systems. In addition, educators will find these results useful to share with students in their descriptions of current practices in audit evidence documentation.
NOTES 1. Using electronic audit records to maintain engagement documentation has gained importance due to the PCAOB inspection process and the requirements of Auditing Standard No. 3 (PCAOB, 2004). 2. The participating audit firm wishes to remain anonymous. 3. Working around the system could be the result of a poorly designed system. However, this is unlikely for the system we study. In general, users do not rate the system as difficult to use (documented in Table 1). In addition, incidence of workaround behaviors is reduced as users gain experience with the system (documented in Tables 3 and 4). This evidence is consistent with initial individual user recalcitrance and/or inexperience, rather than as resulting from poor system design. 4. This paper presents a taxonomy of the component tasks involved in preparing and reviewing audit workpapers in an electronic environment, which should prove useful to audit research and practice. This taxonomy builds on that of Abdolmohammadi (1999) by providing a more detailed breakdown of workpaper
Using Electronic Audit Workpaper Systems in Audit Practice
51
preparation and review tasks, and by focusing specifically on the electronic environment. 5. This research extends a prior study of effects of training on auditors’ acceptance of the newly implemented electronic workpaper system (Bedard et al., 2003). The prior study uses structural equation modeling to investigate relationships posed by the Technology Acceptance Model, examining the roles of system perceptions and self-efficacy (related to both technology and task) on behavioral intention to use the system in the manner intended by its developers. 6. These factors include the extent of working memory capacity (Lee, Tedder, & Xie, 2006), cognitive style (Graff, 2006), navigation strategy (Pratt, Mills, & Kim, 2004), and reading strategy (Salmeron, Canas, Kintsch, & Fajardo, 2005). 7. Personnel of the firm providing data informed us that audit staff and seniors are primarily responsible for preparing the audit workpapers within the new system, so these individuals are collectively termed ‘‘preparers.’’ Audit managers and partners are responsible for reviewing completed audit workpapers within the new system, so these individuals are termed ‘‘reviewers.’’ 8. In the current study, this comparison is facilitated because the firm did not implement any change in the underlying audit process and the objectives of that process during the period in which the system was introduced. 9. In the table, the component tasks are ordered in decreasing mean difficulty ratings by major task category: system security, data input, organization/verification, and review. 10. Because the data are from offices that had been using the electronic system for either one or two years, it is important to test whether difficulty or frequency ratings differ between these groups of offices. Results of independent samples t-tests on difficulty and frequency ratings for preparers and reviewers show that the ratings do not differ between groups. 11. To test whether the ordering of task difficulty ratings is sensitive to scaling, each participant’s difficulty score was divided by their individual average difficulty assessment across all tasks in the electronic system. The results of this test are consistent with our reported results. 12. While reviewers experience greater difficulty than preparers, this research does not pinpoint the precise source of the difference, as preparers and reviewers differ on several dimensions including generation/age differences, extent of prior experience performing audits on paper systems, and the nature of the task performed. 13. The firm’s procedures require that engagement team members not resort to paper processing for tasks encompassed by the system. The system has built-in checks to ensure that all work is completed, and the electronic workpaper is the repository of engagement information that is subject to both internal and external review by higher-level individuals within the engagement team, PCAOB inspectors, and peer reviewers.
ACKNOWLEDGMENTS We thank the participating firm for providing data in support of this project, and for their continuing interest in the outcomes of our research.
52
JEAN C. BEDARD ET AL.
We appreciate the comments of the Editor, anonymous reviewers, and Steve Sutton, which helped us improve the paper.
REFERENCES Abdolmohammadi, M. J. (1999). A comprehensive taxonomy of audit task structure, professional rank, and decision aids for behavioral research. Behavioral Research in Accounting, 11, 52–92. Abdolmohammadi, M., & Usoff, C. (2001). A longitudinal study of applicable decision aids for detailed tasks in a financial audit. International Journal of Intelligent Systems in Accounting, Finance, and Management, 20(3), 139–154. Azjen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211. Bedard, J., Ettredge, M., Jackson, C., & Johnstone, K. (2003). The effect of training on auditors’ acceptance of an electronic work system. International Journal of Accounting Information Systems, 4(4), 227–250. Bible, L., Graham, L., & Rosman, A. (2005). Comparing auditors’ performance in paper and electronic work environments. Journal of Accounting, Auditing and Finance, March, 27–42. Brazel, J., Agoglia, C., & Hatfield, R. (2004). Electronic vs. face-to-face review: The effects of alternative forms of review on auditors’ performance. The Accounting Review, 79(4), 949–966. Chau, P. Y. K. (1996). An empirical investigation on factors affecting the acceptance of CASE by systems developers. Information and Management, 30, 269–280. Davis, F., Bagozzi, R., & Warshaw, P. (1989). User acceptance of computer technology: A comparison of two theoretical models. Management Science, 35(8), 982–1003. Dillon, A. (1996). Myths, misconceptions, and an alternative perspective on information usage and the electronic medium. In: J. Rouet, J. J. Levonen, A. Dillon & R. J. Spiro (Eds), Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum Associates. Glover, S., Prawitt, D., & Romney, M. (2000). The software scene. The Internal Auditor, 57(4), 49–57. Graff, M. (2006). Constructing and maintaining an effective hypertext-based learning environment: Web-based learning and cognitive style. Education and Training, 48(2/3), 143–155. Hartwick, J., & Barki, H. (1994). Explaining the role of user participation in information system use. Management Science, 40(4), 440–465. Jonassen, D. H. (1990). Semantic network elicitation: Tools for structuring hypertext. In: R. Maltese & C. Green (Eds), Hypertext: The state of the art. London: Intellect. Kim, S., & Malhotra, N. (2005). A longitudinal model of continued IS use: An integrative view of four mechanisms underlying postadoption phenomena. Management Science, 51(5), 741–755. Lee, M., Tedder, M., & Xie, G. (2006). Effective computer text design to enhance readers’ recall: Text formats, individual working memory capacity and content type. Journal of Technical Writing and Communication, 36(1), 57–73. Markus, M. L. (1983). Power, politics, and MIS implementation. Communications of the ACM, 26(6), 430–444.
Using Electronic Audit Workpaper Systems in Audit Practice
53
McCollum, T., & Salierno, D. (2003). Choosing the right tools. The Internal Auditor, August, 32–43. McDonald, S., & Stevenson, R. J. (1996). Disorientation in hypertext: The effects of three text structures on navigation performance. Applied Ergonomics, 27, 61–68. Mills, R. J., Paper, D., Lawless, K. A., & Kulikowich, J. M. (2002). Hypertext navigation – An intrinsic component of the corporate intranet. Journal of Computer Information Systems, Spring, 44–50. Nielsen, J. (1990). The art of navigating through hypertext. Communications of the ACM, 33, 296–309. Pratt, J., Mills, R., & Kim, Y. (2004). The effects of navigational orientation and user experience on user task efficiency and frustration levels. The Journal of Computer Information Systems, 44(4), 93–100. Public Company Accounting Oversight Board (PCAOB). (2004). Auditing Standard No. 3: Audit Documentation. Washington, DC. Rich, J. S., Solomon, I., & Trotman, K. T. (1997). The audit review process: A characterization from the persuasion perspective. Accounting, Organizations and Society, 22(5), 481–506. Rosman, A., Bible, L., Biggs, S., & Graham, L. (2007). Successful audit workpaper review strategies in electronic environments. Journal of Accounting, Auditing and Finance, 22 (1), 57–83. Rothman, S. (1997). Expert: Paperless audit saves money. The Credit Union Accountant, November 24, 1. Salmeron, L., Canas, J., Kintsch, W., & Fajardo, I. (2005). Reading strategies and hypertext comprehension. Discourse Processes, 40(3), 171–191. Sellen, A. J., & Harper, R. H. (2002). The myth of the paperless office. Cambridge, MA: The MIT Press. Spiro, R., Coulson, R. I., Feltovich, P. J., & Anderson, D. K. (1988). Cognitive flexibility theory: Advanced knowledge acquisition in ill-structured domains. Proceedings of the Tenth Annual Conference of the Cognitive Science Society (pp. 375–383). Stanton, N. A., & Stammers, R. B. (1990). Learning styles in a non-linear training environment. In: R. McAleese & C. Green (Eds), Hypertext: The state of the art. Oxford: Intellect. Thuring, M., Hannemann, J., & Haake, J. M. (1995). Hypermedia and cognition: Designing for comprehension. Communications of the ACM, 38, 57–66. Trotman, K. (2005). Discussion of M. Nelson and H.-T. Tan, ‘‘Behavioral review judgment and decision-making research in auditing: A task, person and interaction perspective’’. Auditing: A Journal of Practice and Theory, 24(Supplement), 73–87. Venkatesh, V., Morris, M., & Ackerman, P. (2000). A longitudinal field investigation of gender differences in individual technology adoption decision-making processes. Organizational Behavior and Human Decision Processes, 83(1), 33–60. Yang, D. C. (1993). The use of microcomputer software in the audit environment: The implications for accounting education. Journal of Applied Business Research, Summer, 29–31.
This page intentionally left blank
LIMITED ATTENTION AND INDIVIDUALS’ INVESTMENT DECISIONS: EXPERIMENTAL EVIDENCE Bryan K. Church and Kirsten Ely ABSTRACT This study examines whether users’ investment decisions are affected by management’s presentation of software development costs (capitalizing versus expensing the costs) when information from an external source is available (i.e., an analyst’s report). Professional accounting standards mandate that firms capitalize such costs once technological feasibility is achieved (cost recovery is likely). Hence, the implication of capitalizing versus expensing differs, with the former implying a greater expected future benefit. Nonetheless, limited attention suggests that users (particularly novices) may not attend to the accounting for software development costs. An experiment is designed to investigate this issue. The findings indicate that participants pay little attention to the reporting of software development costs and the implied message. Instead, participants’ decisions appear to be driven by their reliance on information from an external source. Participants invest more when an analyst’s report suggests that cost recovery is likely as opposed to tenuous, regardless of how the firm accounts for software development costs. Advances in Accounting Behavioral Research, Volume 10, 55–75 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10003-X
55
56
BRYAN K. CHURCH AND KIRSTEN ELY
INTRODUCTION An important accounting issue in today’s knowledge-based economy is whether standard-setting bodies can write rules that allow for the credible capitalization of soft assets (intangibles). The significance of soft assets is well established, yet the method to account for them is subject to much debate (e.g., Caniban˜o, Garcı´ a-Ayuso, & Sa´nchez, 2000; Lev, 2001). The uncertainties and ambiguities that plague soft asset measurement create concerns as to whether capitalized values are meaningful. The usefulness of capitalizing soft assets hinges on the extent to which users attend to such information, including the accompanying footnote disclosures. Under U.S. GAAP, the only category of internally developed soft assets that permits capitalization is software development. Statement of Financial Accounting Standards (SFAS) No. 86 (FASB, 1985) requires the capitalization of software development costs that occur after technological feasibility, subject to a net realizable value constraint. All costs incurred prior to technological feasibility must be expensed as research and development costs. For users, capitalized costs reflect an asset that is expected to produce a future benefit with reasonable certainty. Hence, SFAS 86 is intended to reflect the substance underlying the incurrence of software development costs. However, the decision to capitalize and the subsequent income effects of amortization are based on management’s judgment. The accounting treatment of software development costs is affected by management’s assessment of the achievement of technological feasibility, market and financial risk factors that impact cost recovery, and revenue projections. The information explaining whether costs have been capitalized and the values necessary to reconcile the software asset are disclosed in the footnotes. As such, understanding the implications of the capitalize/expense decision requires users to pay attention to the footnote information concerning these factors and to interpret the information in the context of the firm’s products and market. Under efficient markets theory, all publicly available information is incorporated into security prices, suggesting that users can discern the relevance of transparent accounting disclosures. Because software cost footnote disclosure is publicly available, users should be able to discern the relevance of a firm’s software costs. But research in decision-making and psychology indicates that individuals do not always attend to relevant or diagnostic information (i.e., individuals are subject to cognitive limitations). Along these lines, Hirshleifer and Teoh (2003) assert that users focus on more salient information, largely ignoring other, less-salient information.
Limited Attention and Individuals’ Investment Decisions
57
They show that, due to users’ limited attention, information that is not salient is not fully incorporated into security prices. The implication is that, if accounting information is not salient, users may not attend to the information and it may not be reflected in security prices. The current study investigates this possibility in the context of the accounting for software development costs. In light of limited attention (Hirshleifer & Teoh, 2003), users may not fully consider the implications of software development cost accounting. As mentioned earlier, the explanation underlying the presentation of such costs, along with additional information that is necessary for transparency, is disclosed in the footnotes. Users’ ability to access and process the information may be constrained by their skill and desire. They may not heed the implications of accounting information directly if competing sources of information are available (e.g., analysts’ reports). Rather, due to limited attention, they may focus on other, more salient information, including media stories and/or analysts’ reports, rendering the accounting method/ disclosure moot. Using the literature on limited attention as a theoretical guide, an experiment is conducted to examine users’ investment in a firm that relies on software development for future success. The focus is on individuals’ investment decisions because more than 41 million nonprofessionals invest directly in the U.S. stock markets (Securities Industry Association, 2002). This subset of the market can have significant effects on security prices. Research that examines the behavior of nonprofessionals can shed insight into the functioning of financial markets and institutions (e.g., Bossaerts, 2001; Hirshleifer, 2001). The experimental setup permits the study of how users’ investment decisions are affected by the presentation of software development costs (capitalize versus expense) and the prospects of such costs as suggested by a competing source of information (favorable versus unfavorable). If users heed the accounting presentation of software development costs, they should invest more in a firm that capitalizes such costs as opposed to expenses the costs. If users heed the advice of a competing information source regarding the prospects of software development costs, they should invest more when the advice is favorable. In the experimental setting, an analyst’s report serves as the competing source of information. Although analysts do not have the same information as managers, their reports may be sufficiently salient such that users attend to their reports. The experimental setup allows for the determination of whether users attend to the accounting presentation of software development costs, the prospects
58
BRYAN K. CHURCH AND KIRSTEN ELY
of such costs as indicated in an analyst’s report, or both sources of information. The experimental results suggest that users pay attention to the prospects of software development costs as indicated in the analyst’s report, but not to the accounting presentation of such costs. The results suggest that the information conveyed in the analyst’s report is more salient than the accounting information and, in turn, more likely to affect users’ investment decisions. Thus, standard setters need to consider the salience of accounting information when pushing for the adoption of particular accounting methods. The remainder of the paper is organized as follows. The second section includes the motivation and development of hypotheses. The third and fourth sections describe the experimental procedures and present the experimental results, respectively. Finally, the fifth section concludes by discussing the results, along with suggestions for future research.
MOTIVATION AND HYPOTHESIS DEVELOPMENT Various experimental studies provide evidence that users may not carefully attend to accounting disclosures. Hirst and Hopkins (1998) report that one half of the participants in their study (equity analysts and portfolio managers) fail to recognize that comprehensive income is included in the statement of changes in stockholders’ equity. Hodge, Kennedy, and Maines (2004) documents that a significant proportion of participants (second-year MBA students) do not correctly recognize how target firms account for stock options (recognition of expense versus disclosure). Elliott, Hodge, Kennedy, and Pronk (2007) produces a similar finding using a broader set of participants: entering MBA students, second-year MBA students, and investors. This evidence is inconsistent with the traditional assumption of rational man – who fully considers all relevant, publicly available information, regardless of placement. Consistent with the evidence of the aforementioned studies, Hirshleifer and Teoh (2003) use the concept of limited attention to explain why the form and placement of accounting information affects the way users incorporate such information. Hirshleifer and Teoh draw on the psychology literature to show how limited attention relates to accounting information and, in turn, point out the importance of salience. Investment in software development firms is an interesting context to study the possible consequences of limited attention, due to the accounting for software development costs, as prescribed in SFAS 86.
Limited Attention and Individuals’ Investment Decisions
59
Three elements are important in considering the application of limited attention to a context that involves SFAS 86. First, the accounting for software development costs has direct investment implications that should be of interest to users. Second, relevant information on software development costs, necessary to fully understand the line item amounts on the face of financial statements, is disclosed in the footnotes, which may not be particularly salient (see e.g., Bloomfield & Libby, 1996). Third, analysts’ reports, an alternative source of information, are readily available and salient. The first two elements require users to process information, an activity that may fall victim to limited attention, particularly given the availability and salience of analysts’ reports. This section briefly reviews Hirshleifer and Teoh’s application of limited attention to accounting disclosures and then develops hypotheses concerning the implications for the information presented in SFAS 86 and for an analyst report as competing information.
Limited Attention Hirshleifer and Teoh’s (2003) underlying assumption is that the time and attention taken to process accounting information is costly. Therefore, each user has a less than 100% probability of attending to a specific piece of accounting information, regardless of the information’s relevance. In this context, Hirshleifer and Teoh discuss five points that are particularly relevant for this paper, all of which hinge on the idea that people focus on more salient information. First, users’ attention is drawn towards salient information, including information that is prominent and/or easily summarized (Fiske & Taylor, 1991; Nisbett & Ross, 1980). Ackert, Church, and Shehata (1996) find that users acquire processed information more frequently than unprocessed information, supporting the notion that easily summarized data draws users’ attention.1 Following this line of argument, most footnote disclosures are not prominent (by definition) and typically consist of facts, devoid of conclusions or summaries. Thus, information contained in footnotes is not particularly salient and, thus, not likely to draw the attention of users. Second, the amount of attention paid to a piece of information may not necessarily reflect the economic importance of the information due to users’ tendency to underweight abstract and statistical information (Kahneman & Tversky, 1973; Nisbett & Ross, 1980). As the lack of context in most footnotes renders the information abstract, users may not absorb the full
60
BRYAN K. CHURCH AND KIRSTEN ELY
economic importance of such disclosures, especially as the information relates to a firm’s financial performance. Third, users are apt to use information in the way that it is presented, as opposed to processing the information further (Slovic, 1972; Payne, Bettman, & Johnson, 1993). This point reiterates the previous one that users prefer processed or summarized information, which in turn is more likely to draw users’ attention. Again, the implication is that accounting information that relies on footnote disclosures to be fully understood is not salient – as the information must be assimilated and processed further. To summarize, accounting information that relies on footnote disclosures is not likely to draw the attention of users. Hirshleifer and Teoh also consider the effect of salience on the relative use of information. Evidence suggests that ease of retrieval (from long-term memory) affects the attention paid to information (Tversky & Kahneman, 1973). If salience makes information easier to retrieve, more attention will be paid to the information. Finally, in experiments requiring subjects to process information, more salient information overshadows less salient information in decision-making, regardless of economic importance (Kruschke & Johansen, 1999). Thus, in the face of more salient competing information, accounting information and, in particular, footnote disclosures will draw less attention from users.
SFAS 86 and Limited Attention The very aspects of SFAS 86 that allow it to reflect the substance of how software development costs impact a firm also make it more likely that users will pay less attention to the information than is warranted. Under SFAS 86, software development costs are treated differently depending upon whether technological feasibility has been achieved. All costs incurred prior to technological feasibility are expensed as research and development while costs incurred after technological feasibility are capitalized. Technological feasibility is determined in one of two ways: roughly speaking, upon the completion of a detailed program design or, if the firm’s process does not include such a design, upon completion of a working model. Generally, the completion of a working model occurs later in the process so that firms using this benchmark are apt to capitalize fewer costs. In fact, a number of notable and successful software firms (e.g., Microsoft, Visio, Yahoo, AmDocs, Novell, and Saga Systems) do not capitalize any software development costs as a result of the working model benchmark.
Limited Attention and Individuals’ Investment Decisions
61
In addition to the technological feasibility benchmark, there is the additional caveat that costs can only be capitalized if they are recoverable given various market and financing risks. Furthermore, for any costs that are capitalized, amortization is dependent on management’s projection of future revenues. The information explaining whether costs have been capitalized and the values necessary to reconcile the software cost asset are disclosed in the footnotes. The complexity of the standard allows management to reflect their judgment of the substance of the costs incurred – whether the costs reflect a future benefit – by capitalizing only those costs that meet the criteria. However, for users to understand the implications of this message, users must pay attention to the difference in how the costs are treated over time and process the footnote and financial statement information in light of the accounting method’s guidelines. Moreover, because of the judgment allowed under the standard, users must consider the incentives of management in their analysis. Therefore, despite the requirements in SFAS 86 for transparency, software development cost accounting is likely to have low salience due to the prominence and processing points raised by Hirshleifer and Teoh (2003). The information necessary to evaluate the accounting information (on software development costs) is in the footnotes, and so is not prominent; it is disclosed as a set of monetary values with little context making it relatively abstract. Combining this information with that contained on the face of financial statements (i.e., line item amounts) and then analyzing the importance in light of the guidelines in SFAS 86 can be fairly taxing in terms of information processing. Because of low salience, relevant information on software development costs may draw less user attention than information provided by a competing, more salient source. To determine whether users pay attention to the accounting presentation for software development costs, the following hypothesis is examined. H1. Investment decisions are not affected by the accounting treatment of software development costs which reflects management’s expectations concerning future prospects, contrary to the intent of SFAS 86.
Analysts’ Reports as Competing Information The availability of analysts’ reports increases the likelihood that limited attention will affect users’ incorporation of software development costs, as
62
BRYAN K. CHURCH AND KIRSTEN ELY
reflected in financial statements, because analysts’ reports are expected to be more salient. Going back to the points raised by Hirshleifer and Teoh (2003), analysts’ reports are both prominent and easily summarized. Such reports are made prominent by frequent reference in the media and are readily available for software development firms. In addition, analysts’ reports allow users to avoid onerous processing, including assimilating and assessing information disclosed on the face of financial statements and in the footnotes, which is necessary to determine the future prospects for firm performance. Analysts’ reports also contain a concise summary narrative or recommendation, which is easy to digest. Moreover, as analysts are not subject to the same incentives as managers, users can assume that analysts look at firm performance through a more objective lens and, thus, take into consideration the possible effects of management incentives. Therefore, analysts’ reports are a salient source of competing information and, in particular, are of greater salience than software accounting disclosures (included on the face of financial statements and in the footnotes). To examine whether users pay attention to the information in analysts’ reports, the experiment described in the next section tests the following hypothesis. H2. Investment decisions are affected by an analyst’s report regarding the future prospects for software development costs, irrespective of the accounting for such costs. More funds will be invested when the analyst’s report provides a favorable assessment of the prospects for software development costs than an unfavorable assessment.
EXPERIMENTAL METHOD An experiment is designed to examine the two hypotheses of interest. A 2 2 experimental design is used to examine the effect of management’s presentation of software development costs and an analyst’s assessment of the future benefit of such costs on participants’ investment decisions. An important feature of the experiment is that participants make decisions that have real economic consequences (i.e., participants’ choices affect their final wealth). This approach allows for an investigation of participants’ actions as opposed to beliefs or behavioral intentions. Hence, the study addresses whether management’s presentation and the analyst’s assessment have a meaningful effect on participants’ investment decisions. Below, the participants, procedures, and experimental manipulations are described.
Limited Attention and Individuals’ Investment Decisions
63
Participants Ninety-eight students at a medium-sized university in the U.S. are recruited to participate in the experiment. The students are recruited via announcements in second-year MBA classes and selected, undergraduate accounting electives. The participants include 78 graduate students and 20 undergraduates. The undergraduates consist of 14 seniors, 4 juniors, and 2 sophomores. Almost all participants are enrolled in a program of study in business (92 of 98). Participants have completed or are currently enrolled in an average of 4.0 accounting courses.2 In addition, students have completed or are currently enrolled in an average of 3.2 finance courses, and 59 of 98 (60%) have previously invested in stock. The students represent reasonable users of financial accounting information as suggested by the Financial Accounting Standards Board’s (FASB) Concepts Statement No. 1 (FASB, 1978, pn. 34). Specifically, users are those who have a reasonable understanding of business and economic activities and are willing to study the information with reasonable diligence. The students certainly satisfy these criteria.
Procedures The experiment is administered to participants in groups of 3–12 over a total of 12 sessions. All participants in a session receive the same experimental materials (i.e., they are all in the same experimental treatment). The authors administer all 12 sessions, conducted at various times outside of class time, over roughly a four-week period. To assess whether information may have been transmitted across sessions, comparisons are made using participants’ responses in early and late sessions. The data do not provide any evidence of differences, suggesting that transmission of information is not a problem. Comparisons also are made to test for demographic differences (e.g., sex, age, year in university, accounting courses taken, finance courses taken, investing experience, etc.) across treatments. No statistically significant differences are found ( p>0.10), which suggests that the mix of students is similar across experimental groups. At the beginning of each session, instructions are distributed and read aloud. The instructions take about 15 min to go through. The instructions note that the experiment is designed to examine individuals’ willingness to invest in technology companies – specifically those that spend significant amounts on research and development and, in turn, place a priority on new
64
BRYAN K. CHURCH AND KIRSTEN ELY
product development. The instructions summarize the accounting treatment for product development costs. By doing so, all participants are aware of the applicable accounting rules and have a fundamental knowledge of the prescribed treatment. In this respect, the accounting for software development costs and its implications for the future prospects of a firm’s software prospects are likely more salient for participants than for the average user, creating a bias against finding evidence to support H1. According to the instructions, participants are endowed with $1 m and asked to allocate the funds among four assets: the common stock of three companies and cash. Participants can choose not to invest in a particular asset, but they cannot short sell. The three companies include a biotechnology firm, a telecommunications firm, and a software firm (the target). The companies are real firms listed on the NASDAQ stock exchange, though the names are disguised. Participants are provided with a set of information on each company, including comparative financial statements, note disclosure of product development costs, excerpts of item 1 (Business) of the 10-K (particularly as it relates to new product development), stock price history (over the past four quarters), and excerpts from an analyst’s report. For the most part, the information is abstracted from reports filed with the Securities and Exchange Commission. Specific dates are not included in the information provided to participants; rather years are referred to generically as 1, 2, and 3. Specific dates (years) are not used to control for knowledge differences between participants as dramatic swings in technology-oriented stocks have occurred in the recent past. Participants are instructed that the information for the three companies is from the same time period. To give participants a sense of the happenings in the marketplace, data are provided on several economic indicators for the three most recent years, including gross domestic product, the NASDAQ composite index, the unemployment rate, and the three-month treasury-bill rate. Participants are given 45 min to review the information on the three companies and allocate their endowment. Very few participants take the full amount of time, suggesting that the time allotment is adequate. Casual observation is that students take approximately 30 min to complete the task. Participants are told the current price of each stock and instructed that they are investing for a three-month period. Participants generate a return for funds invested in common stocks. The return is computed based on the actual, closing price quoted three months after the filing of financial statements (i.e., the return is the change in price over the three-month period divided by the current price). Participants are permitted to allocate funds to
Limited Attention and Individuals’ Investment Decisions
65
cash, though the return is zero. Participants generate earnings based on their investment decisions.3 They are paid a show-up fee of $10 plus 0.000015 times their ending investment, which represents participants’ take-home pay. The average take-home pay is $30.74. After all participants in a session make their portfolio allocation decisions, an investment questionnaire is distributed. On the questionnaire, participants rate the importance of various factors in making their decision to invest in a particular company, including earnings history, the analyst’s report, the accounting method for product development costs, industry membership, and management’s outlook. The ratings are on a seven-point scale anchored by 1=not important and 7=very important. Subsequently, a post-experiment questionnaire is administered, which collects general demographic information. The questionnaire also asks participants to indicate the accounting method used for product development costs, rate the analyst’s characterization of prospects for new products (1=questionable and 9=probable), and describe how they made their investment decision (in an open-ended response). After all materials are completed, the closing price for each stock is announced. Participants compute their earnings and are paid and dismissed.
Experimental Manipulations Information on the target (software) firm is manipulated between participants. Otherwise, participants receive identical information; that is, the instructions and materials on the biotechnology and telecommunications companies do not vary between participants. Management’s presentation of software development costs is manipulated. One half of the participants receive information that the target firm capitalizes software development costs. Under capitalization, the financial statements include line items for product development costs. On the balance sheet, the costs are shown in non-current assets, net of accumulated amortization. On the income statement, capitalized costs are included in cost of sales and non-capitalized costs are shown in operating expenses (i.e., those incurred prior to technological feasibility). On the statement of cash flows, the capitalized costs are shown as outflows from investing activities. A note disclosure also details the costs capitalized and charged to income in each of the past three years. For the other half of the participants, the target firm expenses software development costs. Under expensing, only the income statement includes a
66
BRYAN K. CHURCH AND KIRSTEN ELY
line item and the note disclosure indicates that the costs subject to potential capitalization were not material in any of the past three years. The accounting treatment of software development costs affects total assets, stockholders’ equity, and net income: the amounts are larger when costs are capitalized as opposed to expensed. The amounts are allowed to differ between participants because the differences are inherent in the decision to capitalize or expense costs. However, the experimental materials are such that the trends in amounts, and relative comparability of the target firm to the other companies, are unaffected by the accounting treatment of software development costs. In both treatments, assets and stockholders’ equity are declining and net income is negative and decreasing over time. The content of the analyst’s report is also manipulated such that half the participants receive a report that suggests future benefits associated with software development costs are likely and the other half tenuous. In the former, the analyst’s report states that it is likely the firm will find a ready market for its new product sufficient to recover costs. In the latter, the analyst’s report states that the prospects for cost recovery are tenuous.
RESULTS Main Findings The average investment, partitioned by experimental group, is presented in Panel A of Table 1. Consistent with H2, participants invest more in the software firm when the analyst’s report indicates that future benefits are likely as opposed to tenuous – approximately 36% more. In contrast and consistent with H1, the accounting treatment appears to have a minimal effect on the average investment in the software firm. Contrary to the implications inherent in the accounting treatment, participants invest slightly more when software developments costs are expensed as opposed to capitalized. To formally test for differences, an analysis of variance (ANOVA) is conducted. The dependent variable is participants’ investment in the software firm. The independent variables include the analyst’s assessment of future benefits, management’s presentation of software development costs, and the interaction term. The ANOVA results are presented in Panel B of Table 1. Only the analyst’s assessment is statistically significant at conventional levels ( po0.05). Thus, consistent with both hypotheses, the statistical significance suggests that participants focus on the content of the
Limited Attention and Individuals’ Investment Decisions
Table 1.
67
Participants’ Investment Allocations to the Target Firm.
Panel A: Experimental group means Experimental Groups
Analyst’s assessment Overall
Management’s Presentation
Likely Tenuous
Capitalize
Expense
$359,375 $292,250 $325,813
$405,000 $269,424 $337,212
Overall
$382,653 $280,604 $331,629
Panel B: ANOVA results Source
DF
F-statistic
p-value
Management’s presentation of software development cost Analyst’s report Interaction Error
1 1 1 94
0.052 4.105 0.468
0.820 0.046 0.496
analyst’s report and pay little, if any, attention to managements’ presentation of software development costs. To gain further insight into these findings, participants’ responses to the post-experiment questionnaire are examined.
Attention to Analyst’s Report and Management’s Presentation Participants rate the analyst’s assessment of the prospects for new product development on a nine-point scale, where 1=questionable prospects and 9=probable prospects. The descriptive findings for participants in each experimental treatment are reported in Panel A of Table 2. Participants are more optimistic when the analyst’s report suggests that future benefits are likely as opposed to tenuous, and the difference is statistically significant (t=3.43, p=0.001). The perceived content of the analyst’s report differs between participants, as expected. Participants also recall how the companies accounted for new product development costs. Participants respond whether the costs of new product development are (1) expensed as incurred or (2) capitalized and expensed over future periods. The frequency of responses for participants in each experimental treatment is summarized in Panel B of Table 2. For the target firm, the majority of participants recall that the target firm capitalizes software development cost, regardless of management’s presentation of the costs. Most participants in the capitalize treatment correctly recall
68
BRYAN K. CHURCH AND KIRSTEN ELY
Table 2.
Participants’ Assessment of Experimental Manipulations.
Panel A: Content of the analyst’s report regarding new product development Treatment
Mean (Standard Deviation)
Future benefits likely Future benefits tenuous
6.30 (1.91) 4.87 (2.03)
Panel B: Recall of management’s presentation Treatment
Participants’ Recall
Capitalize Expense
Capitalize
Expense
42 32
6 18
Panel C: Recall of management’s presentation using alternative wording Treatment
Expense
Participants’ Recall Capitalized some costs as incurred
Expensed all costs as incurred
14
8
Note: Panel A reports descriptive data on participants’ assessment of the content of the analyst’s report. Participants rated the analyst’s assessment of the prospects for new product development on a nine-point scale, where 1=questionable prospects and 9=probable prospects. Panel B reports the frequency of participants’ recall of management’s presentation of software development costs for each experimental treatment. Panel C reports the frequency of participants’ recall of management’s presentation of software development costs for a supplemental experiment that only includes the expense treatment.
management’s presentation, whereas most participants in the expense treatment do not correctly recall management’s presentation. Subsequent discussion with participants suggests that those in the expense treatment may have been confused by the wording of the question. Some participants indicate that accounting standards require capitalization once certain conditions are met, but that costs subject to potential capitalization are not material. In other words, the firm capitalizes software development costs, but technological feasibility has not been achieved (i.e., the firm does not currently have costs subject to potential capitalization). Further investigation is undertaken to determine whether responses are caused by participants’ confusion (i.e., the wording of the question to assess their attention to management’s presentation is not clear) or, alternatively, participants may not attend to management’s presentations. Twenty-two
Limited Attention and Individuals’ Investment Decisions
69
students are recruited to participate in a supplemental experiment. All participants have completed at least two accounting courses and are enrolled in an upper-level, financial reporting course. In the supplemental experiment the procedures are similar to those in the main experiment, except a different question is used to assess participants’ attention to management’s presentation. Participants indicate whether, for new product development costs, each company (1) expenses all costs as incurred or (2) capitalizes some costs incurred. All participants receive materials from the same experimental cell: software development costs are expensed and the analyst’s report states that future benefits are likely. Most participants indicate that the target firm capitalizes software development costs (refer to Panel C of Table 2). In fact, the percentage is almost identical to that from the main experiment – 64% of participants in the expense treatment do not correctly recall management’s presentation of software development costs. The findings suggest that, unlike the attention paid to the analysts’ assessment of new product prospects, participants for the most part do not attend to management’s presentation of software development costs.4 Note that the lack of attention is not because participants are unaware that, unlike most research and development costs, software development costs are capitalized. Instead, they apparently recognize this but do not focus on what this particular company does. This behavior is consistent with Hirshleifer and Teoh’s (2003) point that users may accept information in the form presented rather than processing all the relevant information given. Thus, participants may have accepted the software effects on earnings assuming that management had accounted for them properly. However, such acceptance of the form without further consideration suggests a lack of attention to the fact that capitalization of software costs is meant, according to SFAS 86, to imply a higher probability of success for the projects in process. Attention to the analyst’s report indicates that it is not that participants do not care about the prospects of the company’s software projects. Instead, they pay attention to the implications of the analyst’s report even when the report and management’s presentation provide conflicting evidence. Therefore, participants behave as though the analyst’s report is a more salient source of information.5
Factors Affecting Investment Decisions Participants’ indicate how important five factors are in making their decisions to invest in the target firm on a seven-point scale, where 1=not
70
BRYAN K. CHURCH AND KIRSTEN ELY
Table 3.
Invest in the Software (Target) Firm.
Panel A: When the target firm capitalizes software development costs Factor
Content of the Analyst’s Report Future benefits likely
Analyst’s report Earnings history Industry membership Management’s outlook for new product development Accounting method for product development costs
5.13 5.38 4.17 4.54
(1.51) (1.10) (1.74) (1.86)
4.33 (1.66)
Future benefits tenuous 4.75 5.38 3.92 4.29
(1.78) (1.17) (1.72) (1.68)
3.67 (2.10)
Panel B: When the target firm expenses software development costs Factor
Content of the Analyst’s Report Future benefits likely
Analyst’s report Earnings history Industry membership Management’s outlook for new product development Accounting method for product development costs
4.84 5.28 4.16 4.60
(1.62) (1.40) (1.55) (1.41)
3.44 (1.76)
Future benefits tenuous 5.16 5.48 4.84 4.16
(1.43) (1.45) (1.55) (1.62)
3.80 (1.71)
Note: The experimental materials elicit importance ratings of various factors on a seven-point scale, where 1=not important and 7=very important. The table reports the means (standard deviations) for each factor.
important and 7=very important. The five factors include the analysts report, earnings history, the industry, management’s outlook for new product development, and the accounting treatment (i.e., management’s presentation). Descriptive data, partitioned by experimental group, are shown in Table 3. The raw data suggest that, across the four groups, participants assign the most importance to earnings history and the next most importance to the analyst’s report, with less importance assigned to the other three factors. Collapsing the data across the four cells, the importance assigned to earnings history and the analyst’s report is greater than that assigned to the other three factors at po0.05. For each group, Bonferroni pairwise tests are conducted to determine whether the assigned importance differs across the five factors.6 When costs
Limited Attention and Individuals’ Investment Decisions
71
are capitalized and the analyst’s report indicates that future benefits are likely, none of the factors are significantly different. By comparison, when costs are capitalized and the analyst’s report indicates that future benefits are tenuous, significant differences emerge. More importance is assigned to earnings history than to industry membership, management’s outlook, or the accounting treatment ( po0.02). Next, the findings are discussed for the two cells in which the target firm expenses software development costs. When the analyst’s report indicates that future benefits are tenuous, more importance is assigned to earnings history than to management’s outlook or the accounting treatment ( pp0.025). In addition, more importance is assigned to the analyst’s report than to the accounting treatment ( po0.015). The data suggest that participants are more interested in the analyst’s assessment than in the message implied by expensing software development costs. When the analyst’s report indicates that future benefits are likely, more importance is assigned to earnings history and the analyst’s report than to the accounting treatment ( po0.025). Again, the data suggest that participants’ are not particularly interested in the message implied by expensing software development costs. Participants also describe, as best as they can, how they allocate funds among the four assets. Ninety-two of 98 participants provide an open-ended response. The researchers read the responses to identify factors that are listed across participants. Various factors, along with the frequency of appearance in responses, are summarized in Table 4.7 Participants list stock price history most frequently, referring to current stock price and variability
Table 4.
Open-Ended Factors Affecting Participants Investment Decisions.
Factor Stock price history Analyst’s report Research and development expenditures Earnings history Diversification Accounting method for product development costs
Frequency 38 29 27 26 10 6
Note: Participants are asked to describe, in an open-ended question, how they made their investment decisions: 92 of 98 participants provided a response. The authors read the responses to identify common factors. The table reports the number of times that various factors are listed.
72
BRYAN K. CHURCH AND KIRSTEN ELY
in price over the past year. The analyst’s report is the next most frequently listed item, which is consistent with the high importance assigned to this factor (refer to Table 3) and with it being a more salient source of information than management’s disclosure. Participants typically state that they look carefully at the report and invest in a company if the analyst has a positive outlook. Other factors appearing frequently are research and development expenditures and earnings history (the latter item is consistent with the evidence reported in Table 3). Participants are concerned that the firms maintain a commitment to research and development, looking for expenditures to grow or at least remain constant. Lastly, few participants list the implications of management’s capitalize/expense decision as entering into their deliberations. Consistent with the statistical results, this factor does not appear to have much effect on participants’ investment decisions, especially in light of the analyst’s report.
CONCLUSION This study examines whether participants’ investment decisions are affected by management’s presentation of software development costs (capitalize versus expense) when information from a more salient alternative is available (i.e., an analyst’s report). The main finding suggests that participants pay little attention to management’s presentation and the implied message regarding the potential of in-process software projects. Instead, participants’ decisions appear to be driven by their reliance on the analyst’s report. Participants invest more when the analyst’s report suggests that cost recovery is likely as opposed to tenuous, regardless of whether management capitalizes or expenses software development costs. These results are consistent with hypotheses drawing on the limited attention literature, which predict that limited attention undercuts the intent of SFAS No. 86 to provide useful information. Despite participants indicating the importance in decision making of earnings history, which is affected differently by the accounting treatment of software development costs, they do not invest as if capitalizing versus expensing conveys useful information about the potential for new products. The evidence in this study has implications for standards that rely heavily on footnote information and, thus, for the current interest in whether the U.S. should move to a more principles-based approach to setting standards. With a focus on substance over form, principles-based standards require more managerial judgment. For investors to make intelligent, informed
Limited Attention and Individuals’ Investment Decisions
73
decisions, they need to understand what underlies management’s decision to present in a particular way. However, this type of transparency relies on details provided in footnotes and a fairly sophisticated understanding of the accounting guidelines. No matter how relevant the information from principles-based standards is, it will not be effective if investors do not pay attention to it. A fruitful area for future research is to study how accounting disclosures, and particularly footnote disclosures, can be made more salient. A first step is to identify what causes the lack of salience. Is it the processing of data required, lack of prominence of accounting disclosures, concern over assessing the impact of management’s incentives, or something else? In all likelihood, a combination of factors is at work. However, only after researchers and the business community reach an understanding of why disclosures lack salience can focus be put on ways to increase attention. The aim ultimately is to promote the effectiveness of accounting disclosures.
NOTES 1. In Ackert et al. (1996), processed information is a simple linear transformation of unprocessed information. That is, processed information is not necessarily more useful than unprocessed information. 2. Ninety-six of 98 students had completed at least one accounting course. In recruiting participants, announcements were only made in classes in which students should have previously completed at least one accounting class. Nonetheless, two students who had not taken an accounting course ended up in the sample. The results, reported subsequently, are unaffected if these two students are excluded from the data analyses. The results also are unaffected if the sophomores (n=2) or nonbusiness students (n=6) are excluded from the data analyses. 3. After the instructions are read aloud, participants are permitted to ask questions, which are answered aloud by the experimenters. Very few ask questions. To ensure that information does not vary across experimental sessions, the answers only involved restating information that is contained in the experimental materials. 4. In subsequent analyses, an ANOVA is performed similar to the one reported earlier (refer to Panel B of Table 1). The earlier ANOVA is modified to include an independent variable to measure whether participants correctly recall the accounting treatment of software development costs (correct versus incorrect). Overall inferences are unchanged and the variable to measure participants’ recall is not statistically significant (F=0.185, p=0.669). 5. As mentioned in Section 2, there are a number of reasons why management’s presentation could be less salient. Investors’ confusion over the fact that successful firms, such as Microsoft, expense rather than capitalize software costs is one such reason (although it is not known whether participants are aware of this fact because it was not included in the experimental materials), and the possible incentives of
74
BRYAN K. CHURCH AND KIRSTEN ELY
management is another – as both of these would increase processing costs, thus lowering salience. This study does not attempt to determine the specific cause of lower salience. 6. Inferences are unaffected if ranks are assigned to the five factors. 7. The frequency of responses is very similar across the different experimental groups. Therefore, the data are collapsed across the groups.
ACKNOWLEDGMENTS The authors acknowledge the financial support of the College of Management, Georgia Institute of Technology.
REFERENCES Ackert, L. F., Church, B. K., & Shehata, M. (1996). What affects individuals’ decisions to acquire forecasted information? Contemporary Accounting Research, 13, 379–399. Bloomfield, R., & Libby, R. (1996). Market reaction to differentially available information in the laboratory. Journal of Accounting Research, 34, 183–207. Bossaerts, P. (2001). Experiments with financial markets: Implications for asset pricing theory. American Economist, 45, 17–32. Caniban˜o, L., Garcı´ a-Ayuso, M., & Sa´nchez, P. (2000). Accounting for intangibles: A literature review. Journal of Accounting Literature, 19, 102–130. Elliott, W. B., Hodge, F. D., Kennedy, J. J., & Pronk, M. (2007). When are graduate business students a reasonable proxy for nonprofessional investors? The Accounting Review, 82, 139–168. Financial Accounting Standards Board. (1978). Concepts statement no. 1. Norwalk, CT: FASB. Financial Accounting Standards Board. (1985). Statement no. 86: Accounting for the costs of computer software to be sold, leased, or otherwise marketed. Norwalk, CT: FASB. Fiske, S., & Taylor, S. (1991). Social cognition (2nd ed.). New York: McGraw-Hill. Hirshleifer, D. (2001). Investor psychology and asset pricing. Journal of Finance, 56, 1533–1598. Hirshleifer, D., & Teoh, S. H. (2003). Limited attention, information disclosure, and financial reporting. Journal of Accounting and Economics, 36, 337–386. Hirst, D. E., & Hopkins, P. E. (1998). Comprehensive income reporting and analysts’ valuation judgments. Journal of Accounting Research, 36, 47–75. Hodge, F. D., Kennedy, J. J., & Maines, L. A. (2004). Does search facilitating technology improve the transparency of financial reporting? The Accounting Review, 79, 687–703. Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251. Kruschke, J., & Johansen, M. (1999). A model of probabilistic category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 1083–1119. Lev, B. (2001). Intangibles: Management, measurment, and reporting. Washington, DC: The Brookings Institute. Nisbett, R., & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgment. Englewood Cliffs, NJ: Prentice-Hall.
Limited Attention and Individuals’ Investment Decisions
75
Payne, J., Bettman, J., & Johnson, E. (1993). The adaptive decision-maker. Oxford: Pergamon. Securities Industry Association. (2002). Equity ownership in America. New York, NY: SIA. Slovic, P. (1972). From Shakespeare to Simon: Speculations – and some evidence about man’s ability to process information. Eugene, OR: Oregon Research Institute. Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232.
This page intentionally left blank
UNETHICAL FINANCIAL DECISION-MAKING: PERSONAL GAIN VS. CONCERN FOR OTHERS Frank Collins, Oscar J. Holzmann, Suzanne Lowensohn and Michael K. Shaub ABSTRACT Following the Enron scandal and numerous revelations of corporate misdeeds, general criticism and skepticism over current accounting practices is unparalleled. In this atmosphere, one wonders why business leaders make decisions that later appear to be blatantly unethical, and in some cases, illegal? Is it for personal gain? Is it due to a lack of concern for others? The current study approaches these questions by examining whether self-centered tendencies, such as career self-interest, or alternately, others-centered tendencies, measured by concern for others, can predict the propensity to make improper accounting decisions. Among 87 MBA students surveyed, we found that individuals with high levels of Career Self-Interest are more likely to make questionable decisions, while those high on Concern for Others are less likely to do so. However, Career Self-Interest is uncorrelated with Concern for Others.
Advances in Accounting Behavioral Research, Volume 10, 77–100 r 2007 Published by Elsevier Ltd. ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10004-1
77
78
FRANK COLLINS ET AL.
INTRODUCTION General criticism and skepticism over current accounting practices and business ethics is unparalleled. Frauds at companies such as Tyco International, Adelphia Communications, Enron, and WorldCom (now MCI) have dominated the headlines for the past several years. The accounting firm Arthur Andersen no longer exists, Martha Stewart is a convicted felon, and Bernard Ebbers, former WorldCom CEO, has been sentenced to 25 years in prison, all for financial fraud (CNN Money, 2002; Masters, 2004; MSNBC News, 2005). Despite passage of the SarbanesOxley Act in June 2002 to reduce accounting fraud, fraud within business organizations continues to be rampant (AccountingWeb Inc., 2003a, 2003b; KPMG LLP, 2006). In this atmosphere, one wonders why business leaders would make decisions that later appear to be blatantly unethical, and in some cases, illegal? Is it for personal gain? Is it due to a lack of concern for others? For example, during November 2001, just preceding Enron’s demise, top Enron executives cashed out more than a billion dollars in company stock when it was near its peak. In addition, nearly 600 employees deemed critical to Enron’s operations received more than $100 million while many longtime employees, including those who worked for energy and utility companies that Enron acquired, had their life savings wiped out (CNN.com, 2002). Given the dramatic impact of questionable ethical decision-making in accounting and financial decisions, examining individual attitudes and mind-sets that might lead to making these decisions is imperative. Consequently, survey data was gathered from 87 MBA students concerning the following research question: Does a relationship exist between the independent variables, Career Self-Interest and Concern for Others, and the dependent variable, Unethical Decision-Making? The study’s first independent variable, Career Self-Interest, is based upon wellestablished theories of motivation which suggest that individuals are motivated to perform certain actions if there is a positive outcome or personal benefit associated with taking these actions (Vroom, 1964; Kanfer, 1990). The second independent variable, Concern for Others, is drawn from Wheat’s (1991) theoretical construct of spirituality,1 and represents the extent to which decision makers demonstrate an awareness of the needs of others and active compassion for others. The results indicate that some MBA students, when making decisions in an accounting context, will be most strongly influenced to make choices that lead to their greatest personal benefit in terms of pay, promotion, or job
Unethical Financial Decision-Making
79
security, regardless of the ethical nature of these decisions. Alternately, some individuals were most strongly influenced by the second independent variable, termed ‘‘Concern for Others.’’ Thus, persons with high Career SelfInterest and low Concern for Others were more likely to commit unethical financial acts. The next section introduces prior research on ethical decision-making, the motivational impact of personal gain on decision-making, and the importance of the dimensions of human spirituality, particularly those related to concern for others. Discussions of the study’s scenario approach and research method follow the formal statement of hypotheses. The paper concludes with discussion of study results and their implications for corporate business practices and business school curricula.
THEORY DEVELOPMENT This study seeks to identify factors that influence the unethical decisionmaking behavior of individuals. Classical theories of human motivation provide potential explanations for behavioral tendencies and variations in behavior (Chung, 1977). Doctrines of hedonism and capitalism stress that individuals will choose pleasure over pain and pursue their self-interest, while Christian doctrine acknowledges good and evil, highlighting the Greek philosophers’ concept of virtue and societal good as the basis of human behavior. Some also suggest that individuals possess both hedonistic and altruistic characteristics (Jensen, 1994; Jensen & Meckling, 1994; Smith, 1976). Perhaps the central conflict of ethical decision-making is the struggle over how much others’ interests should count. Adam Smith recognizes that the ‘‘y wise and virtuous man is at all times willing that his own private interest should be sacrificed to the public interest of his own particular order or society’’ (Smith, 1976, p. 384). Virtue is defined by Smith, in part, as taking into account the interests of others. Yet Smith admits that the road to fortune and the road to virtue ‘‘y lie sometimes in very opposite directions’’ (Smith, 1976, p. 130). Ethical decision-making has been the subject of much research both in the general business and accounting disciplines to date (O’Fallon & Butterfield, 2005; Jones, Massey, & Thorne, 2003; Loe, Ferrell, & Mansfield, 2000; Louwers, Ponemon, & Radtke, 1997; Ford & Richardson, 1994; Ponemon & Gabhart, 1993). Researchers have empirically examined numerous individual characteristics, such as age (Shaub, 1994), gender (Gilligan, 1982), political orientation (Emler, Renwick, & Malone, 1983; Fisher & Sweeney,
80
FRANK COLLINS ET AL.
1998), professional and organizational commitment (Ferrell & Gresham, 1985; Shaub, Finn, & Munter, 1993), training and experience (Bebeau, 1994; Hiltebeitel & Jones, 1992), and personal values (O’Fallon & Butterfield, 2005; Nonis & Swift, 2001). Other studies have examined the effects of contextual factors, including the immediate job context (Trevino & Weaver, 2003), the cultural environment (Cohen, Pant, & Sharp, 1995, 1996), and the professional and organizational environments (Trevino, 1986; Ferrell & Gresham, 1985). Despite the plethora of research on unethical decision-making, only two studies have examined the potential conflicting elements of hedonism or selfinterest and concern for others together. Glass and Wood (1996) find both positive outcomes and altruism to be significantly related to the likelihood for students to illegally copy software, while Glover, Bumpus, Logan, and Ciesla (1997) examine achievement orientation as well as concerns for honesty, fairness, and others, but do not confirm hypothesized relationships. This study extends this work by examining the effects of this tension between self-interest and concern for others in the context of accounting and financial decision-making by MBA students. Motivational theories examine factors that underlie the choices and actions of individuals, hence this study first turns to motivational research and focuses on expectancy theory, which highlights self-centered tendencies in explaining human behavior. However, while it may be possible to predict many types of human behavior based upon self-interest, simple rationality, without the restraint of integrity, is considered inadequate for ethical decisions. Unethical decision-making may arise in situations where others’ interests ought to be taken into account. Hence others-centered tendencies, elements of spirituality, which reflect individuals’ appreciation of life and compassion for others, are also explored as potential influences in ethical decision-making.
Motivation and Career Self-Interest Cognitive choice theories of motivation attempt to determine why a person selects one alternative course of action over another.2 One stream of motivational research, termed ‘‘expectancy theory,’’ is based upon the fundamental concept that individuals will be motivated to exert effort when they believe their actions will lead to personally satisfying results. The classic expectancy model of motivation, conceptualized by Vroom (1964), presents motivation as a function of probability assessments associated with different
Unethical Financial Decision-Making
81
types of behavior (expectancy), expectations related to the receipt of outcomes based upon behavior (instrumentality), and the desirability of the outcomes associated with these behaviors (valence). The theory contends that individuals will choose to engage in actions or behaviors that maximize positive outcomes and minimize negative effects. Extensive research in the areas of preference and choice behavior within a work environment has provided strong support for expectancy-type models.3 Expectancy theory has also been widely adopted within behavioral accounting literature (Parker, Ferris, & Otley, 1989). In accounting settings, expectancy-type studies have addressed topics which include: employee turnover and job satisfaction (Dillard & Ferris, 1979; Ferris, 1977a, 1977b, 1978), budget-related performance (Kren, 1990), underreporting behavior (Lightner, Adams, & Lightner, 1982), and general motivation of accounting subjects (Awasthi & Pratt, 1990; Harrell, Caldwell, & Doty, 1985). These prior empirical accounting studies have identified outcomes that are desirable and important to accounting professionals (i.e., have a positive valence). Desirable outcomes germane to accounting professionals include pay raises, pay relative to peers, job security, promotion potential, career opportunities, professional marketability, reputation among executives and peers within the organization, social status, and privileges and perquisites (Ferris, 1977a, 1977b, 1978; Dillard & Ferris, 1979; Seiler, Collins, & Johnson, 1987).4 Using confirmatory factor analysis, Lowensohn and Collins (2001) classified many of these incentives/outcomes on ‘‘career’’ or ‘‘status’’ dimensions in their study of audit partner motivation. Incentives/outcomes are motivational. Beams, Brown, and Killough (2003) find that managerial accounting students are more likely to engage in insider trading as expected wealth gain increases, while Shapeero, Koh, and Killough (2003) conclude that accountants who expect greater rewards from underreporting are more likely to underreport chargeable time. Hoffman, Couch, and Lamont (1998) find that individuals are less likely to act in an ethical manner when their economic well-being is at stake; however, Gillett and Uddin (2005) find that CFO intention to act unethically is not affected by compensation structure. Tang and Chiu (2003) find that the ‘‘love of money’’ is significantly related to unethical behavior, termed ‘‘evil’’ in their study, for a sample of workers in Hong Kong, while Hegarty and Sims (1978) demonstrate that the availability of extrinsic rewards is the most significant indicator of unethical decision-making. Given the apparent relationship between desirable outcomes and unethical decision-making, it is plausible that individuals might make unethical accounting and financial decisions for the ‘‘payoffs’’ that these
82
FRANK COLLINS ET AL.
acts may provide them in terms of career-related extrinsic rewards. This study posits that the desire to satisfy one’s career self-interest affects one’s tendency to make unethical accounting and financial decisions, resulting in the following hypothesis:5 H1. The desire to satisfy one’s career self-interest is positively related to making unethical accounting and financial decisions. Human Spirituality and Concern for Others Most theories of ethical decision-making include a component that goes beyond a calculation of external rewards or consequences to incorporate an internal evaluation that is based on a sense of duty or a religious teaching (i.e., Shamir, 1990; Hunt & Vitell, 1986). While personal values and ideals provide mixed results in studies of ethical decision-making in business (O’Fallon & Butterfield, 2005; Nonis & Swift, 2001), Giacalone and Jurkiewicz (2003) find that spirituality is associated with individual perceptions of unethical practices. Wheat (1991) identifies a concept, termed ‘‘spirituality,’’ which considers how individuals deal with relations among themselves and other living things.6 His operational definition involves the process of thinking, feeling, and acting; it can be expressed by personal values, an inner unobservable experience, or specific behaviors. His construct is also contextual, involving: y the personal valuing, experiencing or behavioral expression of (a) a larger context or structure in which to view one’s life, (b) an awareness of and connection to life itself and other living things, and (c) a reverent compassion for the welfare of others. (p. 89)
Using this definition, Wheat (1991) developed and validated a composite measure that frames the concept of spirituality as different from mere religious commitment. While a number of important psychologists have included humankind’s spiritual nature as part of their theory of human behavior (Huitt, 2000), the accounting literature provides little insight into the influence of spirituality on unethical decision-making. In a business context, spirituality has been influential in corporate leadership initiatives, from Greenleaf ’s (1977) servant-leadership approach to Covey’s (1989) focus on the importance of leaders’ inward reflection (Wagner-Marsh & Conley, 1999). Using Wheat’s construct, Giacalone and Jurkiewicz (2003) find that spirituality negatively influences individual perceptions of unethical practices. Furthermore, some companies have found aspects of spirituality, including community and
Unethical Financial Decision-Making
83
contributing to the greater good, to be important in accomplishing their mission (Milliman, Ferguson, Trickett, & Condemi, 1999). Given Wheat’s definition of spirituality above and the related research, it follows that elements of spirituality affect individual behavior and decisionmaking. In fact, Shamir (1990) argues that individual morals may also influence work motivation, and thus, should be incorporated into expectancy models. Consequently, we posit that concern for others affects one’s tendency to make unethical accounting and financial decisions because of the effects of these decisions on others, leading to the following hypothesis: H2. One’s concern for others is negatively related to making unethical accounting and financial decisions.
METHODOLOGY Data Collection Data were collected from surveys completed by 87 MBA students at three universities located in the United States. The average respondent was 32.4 years old and 65.4% were male. MBA students are capable of making choices among accounting-related alternatives, and thus reflect the ‘‘mindset’’ of those involved in the present scandals. Furthermore, since each university’s MBA classes are made up largely of students with significant work experience, averaging over five years in their jobs, their attitudes should mirror those that exist in the work place. The survey instrument was completed on a volunteer basis in MBA accounting classes. Demographic data for survey respondents is given in Table 1.
Variables Unethical Decision Scenarios: Dependent Variable The dependent variable, Unethical Decision-Making, is the individual’s willingness to engage in questionable accounting and financial decisions that are either unethical or contrary to generally accepted accounting principles. Jones (1991, p. 367) defines an ethical decision as one that is ‘‘both legal and morally acceptable to the larger community,’’ whereas an unethical decision is ‘‘either illegal or morally unacceptable to the larger community.’’ In an accounting context, Cottell and Perlin (1990, p. 92) state, ‘‘To be moral is to
84
FRANK COLLINS ET AL.
Table 1.
Demographic Statistics for the MBA Student Subjects.
n=87 Mean age Gender Educational level (highest attained) Bachelor’s degree Some graduate Graduate Mean hierarchical level (1=lower level, 7=top level executive) Mean job tenure Area of work Accounting Marketing/sales Information systems Finance Consulting Other
32.4 years 65.4% male, 34.6% female 21.2% 57.6% 21.2% 3.73 5.34 years 16.3% 19.8% 12.8% 16.3% 17.4% 17.4%
abide by the codes or covenants adopted by one’s peers, or by their predecessors; it is to practice the standards of, say, accounting faithfully and consistently.’’ To measure the dependent variable, Unethical Decision-Making, seven decision-making scenarios were created and respondents were asked to indicate how likely they were to take a questionable or unethical action (Likert scale anchored ‘‘Not at All Likely’’ to ‘‘Very Likely’’). Five of the scenarios were based on recent corporate circumstances alleging fraud or insider trading. In these scenarios (summarized in Appendix A) respondents were asked how likely they were to perform an egregious action concerning:
Inappropriate capitalization of expenditures Converting allowance for doubtful accounts to income Evading taxes Doing wrong for the ‘‘greater good’’ Self-dealing Bonus at another’s expense Insider trading
Independent Variables The research instrument includes nine extrinsic career-related reward items measuring Career Self-Interest and sixteen items measuring Concern For
Unethical Financial Decision-Making
85
Others.7 The Career Self-Interest items come from Lowensohn and Collins’s (2001) study, which reports a coefficient a=0.81. Items for the scale are based on the statement, ‘‘I will do whatever it takes to (obtain) y [desirable reward].’’ These five-point Likert items, anchored by ‘‘Disagree’’ and ‘‘Agree,’’ measure the extent to which each person is motivated by the desirableness of extrinsic rewards. The items from which Concern for Others was derived are adapted from Wheat’s (1991) work, and are measured on a five-point Likert scale anchored by ‘‘Disagree’’ and ‘‘Agree’’. Wheat reports a coefficient a of 0.89 and establishes evidence of construct validity through three separate studies.8
Construct Validity Partial Least Squares (PLS) analysis was chosen as the method for analyzing our data and identifying the structural relationships among variables.9 Using the survey items described above, the PLS Graph Version 3.00 Build 1060 analysis model formed three latent variables consisting of eight indicator variables for Career Self-Interest, seven indicator variables for Concern for Others, and four indicators for Unethical Decision-Making. Validity measures for the variables are presented in Table 2. PLS provides a coefficient reliability index (CRI) that is superior to the coefficient a, since it is weighted by the contribution of each indicator to the latent variable. Similar to coefficient a, a CRI value of X0.70 indicates that a measure is reliable (Nunnally & Bernstein, 1994; Hair, Anderson, Tatham, & Black, 1998). Given the CRI in Table 2, our measures have sufficient reliability. Convergent validity is tested in two steps: (1) Are the factor loadings significant and >0.50 (Fornell & Larcker, 1981; Straub, Gefen, & Boudreau, 2006) and (2) are indicator t-values significant (Fornell & Larcker, 1981; Straub et al., 2006)? Using a bootstrap routine, the t-values depicted in Table 2 were computed. All t-values are significant at pp0.001. The factor loadings are reasonably strong with those for Career Self-Interest and Unethical Decision-Making X0.70. The indicator loadings for Concern for Others ranged from 0.708 to 0.605. In this regard, Fornell and Larcker (1981) and Nunnally and Bernstein (1994) suggest a 0.50 cutoff (see discussion in Straub et al., 2006).10 Another test for convergent validity is that the square root of the AVE should be greater than 0.707 (Barclay, Higgins, & Thompson, 1995; Chin, 1998). The correlation matrix presented in Table 2 shows that with the exception of Concern for Others (0.689) all
86
FRANK COLLINS ET AL.
Table 2. Variable
ER1 ER2 ER3 ER4 ER5 ER6 ER8 ER9
SPIR4 SPIR8 SPIR11 SPIR12 SPIR14 SPIR19 SPIR20
SC1 SC2 SC4 SC5
Descriptive and PLS Validity Statistics. Loading t-Statistic Mean S.D.
Textual Variable Item Career self-interest (latent variable), CRI=0.961, AVE=0.759 I will do whatever it takes to enhance my job security. I will do whatever it takes to enhance my career opportunities. I will do whatever it takes to enhance my reputation with my company’s executives. I will do whatever it takes to gain recognition from my peers. I will do whatever it takes to enhance my social status in the community. I will do whatever it takes to enhance my promotion potential. I will do whatever it takes to enhance my pay raises. I will do whatever it takes to enhance my pay relative to my peers. Concern for others (latent variable), CRI=0.849, AVE=0.447 I value the relationship between all living things. It is important to be sensitive to pain and suffering. All forms of life are valuable. I feel sad when I see someone in pain. I listen closely when people tell me their problems. I feel guilty when I don’t tell the truth. I enjoy guiding young people. Unethical decision-making (latent variable), CRI=0.847, AVE=0.583 Inappropriate capitalization. Converting allowance for doubtful accounts to income. Doing wrong for the greater good. Self-dealing.
N/A
N/A
3.29
1.10
0.808
12.11
3.32
1.17
0.918
50.90
3.51
1.37
0.889
37.11
3.37
1.26
0.908
44.49
3.20
1.24
0.749
14.37
2.94
1.23
0.880
30.12
3.52
1.31
0.938
65.34
3.30
1.25
0.854
27.19
3.14
1.33
N/A
N/A
3.97
1.02
0.605 0.706
5.04 9.04
3.75 3.97
1.01 1.04
0.609 0.718 0.614
4.95 10.32 5.56
4.03 4.18 3.86
1.21 0.88 0.95
0.795 0.682
18.94 10.59
4.16 3.84
1.07 1.04
N/A
N/A
2.01
0.82
0.826 0.808
17.49 17.20
2.20 1.70
1.11 .98
0.635 0.798
6.42 18.33
2.20 1.93
1.08 1.12
Correlations between latent variables and square roots of AVEs (shaded figures – AVE square roots) CSI Career self-interest (CSI) Concern for others (CFO) Unethical decision-making (UDM)
0.867 0.055 0.384
CFO
0.689 0.441
UDM
0.764
Unethical Financial Decision-Making
87
AVE square roots were X0.707. So, given this latter test, our indicator loadings, and significant t-values, we concluded that we have reasonably strong convergent validity for our latent constructs (though it is somewhat weaker for Concern for Others). To test for discriminant validity, loadings of the indicator variables should load more highly on the appropriate latent construct than on the other latent constructs, e.g., the extrinsic reward variables (ER1, ER2, etc.) should load more highly on Career Self-Interest than the other latent constructs (Barclay et al., 1995; Agarwal & Karahanna, 2000). A confirmatory factor analysis indicated that all indicators loaded more highly on the appropriate construct. Another test for discriminant validity is that the square root of the AVE should be higher than its correlation with other constructs (Barclay et al., 1995; Chin, 1998). The correlation matrix presented in the Table 2 section, Correlations Between Latent Variables and Square Roots of AVEs, shows that all AVE square roots were greater than their correlations with other latent variables. Consequently, we conclude that our constructs have discriminant validity. In addition, our sample size exceeds the guideline presented by Chin, Marcolin, and Newested (2003) which suggests a sample size of 10 times the number of the largest group of indicators for a latent variable (our largest indicator group=8; required sample size=80; present sample=87). Thus, eight extrinsic reward items, seven concern for others items, and four scenarios formed our latent independent and dependent variables, Career Self-Interest, Concern for Others, and Unethical Decision-Making.11 In addition to validity statistics, descriptive statistics for the components of the latent variables, Career Self-Interest and Concern for Others (independent variables) and Unethical Decision-Making (dependent variable) are provided in Table 2. All items were measured on five-point Likert scales. For the eight Career Self-Interest items, respondents reported a moderate level of motivation associated with the various extrinsic rewards (overall mean of 3.29). The means ranged from 2.94 for enhancing social status to a high of 3.52 for enhancing promotion potential. Means of the items comprising Concern for Others indicate relatively high mean values (overall mean of 3.97 on a five point scale) ranging from a low of 3.86 for listening closely when people tell me their problems to a high of 4.16 on being truthful. For our dependent variable, Unethical Decision-Making, the overall mean was a moderately low 2.01 and ranged from a low of 1.70 for Converting Allowance to a high of 2.20 for Inappropriate Capitalization and Doing Wrong for the Greater Good. Since the mean values of all the scenarios were p2.20 (on a five-point scale), there was a moderately low
88
FRANK COLLINS ET AL.
propensity to elect any of them. Yet, each scenario is based on a fairly egregious violation of generally accepted accounting principles or is in fact illegal. Thus, even the moderately low mean values reported are important.
RESULTS AND DISCUSSION Tests of Hypotheses: Discussion The structural model is presented in Fig. 1 and depicts structural paths among the independent variables, Career Self-Interest and Concern for Others, and the dependent variable, Unethical Decision-Making. As shown in Fig. 1, the model R2 is 0.324 and the path coefficients are significant at pp0.001 (t-values are shown in parentheses) and have the hypothesized sign, i.e., Career Self-Interest is positively related (standardized coefficient=0.367) to Unethical Decision-Making and Concern for Others is negatively related (standardized coefficient= 0.451). Consequently, the hypotheses are accepted. Persons with high Career Self-Interest were more likely to make unethical decisions and persons with high Concern for Others were less likely to make unethical decisions.
Career SelfInterest 0.367* (4.02)
R 2 = 0.324 Unethical DecisionMaking
-0.451 (5.53) Concern for Others
* Path loadings & t-values (in parentheses, all significant at p < 0.001)
Fig. 1.
Structural Model.
Unethical Financial Decision-Making
89
Table 3. Correlations among the Four Unethical Decision-Making Scenarios (1–4), the Independent Variables (5–6), and the Summated Dependent Variablea. 1 1 2 3 4 5 6 7
Inappropriate capitalization Converting allowance for DA to income Doing wrong for the greater good Self-dealing Career self-interest Concern for others Unethical decision-making
2
3
1.000 0.530
1.000
0.335
0.435
1.000
0.451 0.362 0.390 0.759
0.538 0.276 0.284 0.801
0.458 0.257 0.126 0.726
4
1.000 0.307 0.309 0.790
5
1.000 0.095 0.387
6
7
1.000 0.362 1.000
a
These are the correlations among the four scenarios that make up the dependent variable, unethical decision-making, the independent variables, career self-interest and concern for others, and the dependent variable. po 0.01 (one-tail significance levels).
This finding is consistent with Giacalone and Jurkiewicz (2003) who found that one’s spirituality (using Wheat’s scale) affects perceptions of whether a practice is ethical or not. The present study takes the additional step of linking Concern for Others and Career Self-Interest to MBA students’ willingness to engage in unethical accounting and financial practices. A review of the correlation coefficients presented in Table 3 is consistent with the PLS analysis. Career Self-Interest is associated with a higher likelihood of making each unethical decision, and Concern for Others is associated with a lower likelihood of Unethical Decision-Making. In addition, the higher the likelihood of making one unethical decision, the higher the likelihood was of making others, as shown by the fact that all of the decision scenarios correlate significantly and positively among themselves. However, Career Self-Interest and Concern for Others are not significantly correlated.12
Demographic Issues Demographic data were also gathered from respondents, including hierarchical level, gender, time on job, their type of work, business of their company, level of education, and professional qualifications. A full PLS model including all potential interacting or moderating variables was tested and produced only one minor moderating influence from a demographic. Thus, the study’s findings hold across the MBA student demographics.13
90
FRANK COLLINS ET AL.
Implications Given the current state of affairs, it is imperative that American business takes positive steps to reduce the number of improprieties and to raise confidence in accounting and general business operations. This study’s results indicate that Career Self-Interest and Concern for Others influenced these MBA students’ willingness to make unethical decisions. While such findings require future study and replication with actual business professionals, knowledge of these relationships may prove meaningful to businesses, professional organizations, and academics in business disciplines. Kamberg (2001) notes that managers must practice ‘‘moral management’’ by encouraging core corporate values such as honesty, integrity, customer service, and community involvement. While a number of companies claim to have overhauled their corporate codes of ethics in the wake of the recent scandals (see Schlank, 2002; AccountingWeb, 2003c; Mehta, 2003; Riordan & Riordan, 2003; Zikmund, 2003), whether such efforts will curb unethical actions remains to be seen. In fact, corporations may wish to monitor employee attitudes regarding Career Self-Interest and Concern for Others, particularly those in positions where unethical behavior exposes the organization to significant risks.14 Furthermore, organizations should carefully consider their employee reward structure to minimize the opportunity for individuals to receive personal benefit by making unethical decisions. Individual businesses and professional organizations, such as the AICPA and IMA, should be encouraged to offer seminars or other training activities to make accountants aware of the positive influence that Concern for Others can have in reducing the propensity to engage in unethical behavior. Encouraging this type of Concern for Others is not only in the interest of society, but also in the company’s interest. For example, a study by Walker Information, a global stakeholder research firm, reports evidence that relates ethical corporate behavior, or ‘‘good corporate citizenship,’’ to positive business outcomes (Verschoor, 2001). Finally, research in this area may aid in curriculum development. Most accounting programs have courses or parts of courses devoted to ‘‘ethics,’’ but these tend to focus on professional organizations’ ethical standards (i.e., IMA, AICPA, etc.) and do not look to individual dispositions and attitudes (such as Concern for Others) that should be part of a decisionmaker’s value set. Students may benefit from a greater focus on protecting the public interest in these courses, which is a natural way to communicate
Unethical Financial Decision-Making
91
the importance of Concern for Others. Also, the study results provide additional content guidance for the CMA and CPA exams. Indeed, the IMA Ethics Center could include content similar to this study’s scenarios in a case study or other materials. In summary, both Career Self-Interest and Concern for Others are important to ethical decision-making in business. Simple reliance on selfrestraint without giving attention to incentives affecting self-interest is naı¨ ve. However, structuring rewards consistent with agency theory is not the only way to affect management’s ethical decision-making. The accounting profession is responsible for understanding how to effectively combine these factors to create an atmosphere conducive to ethical behavior in business.
Suggestions for Further Research The present study has established that the hypothesized relationships between Career Self-Interest and Concern for Others with Unethical Decision-Making exist for the sample of MBA students surveyed. Largely, these suggestions are preliminary, and more work is needed to evaluate their effectiveness and to determine whether individual differences on the two independent variables can be changed through training or other corrective measures. Future research should also examine whether these findings hold in corporate settings as well as among MBAs. Additional research that examines the impact of MBA programs on Career Self-Interest and Concern for Others seems warranted as well. Do programs with a focus on others through an emphasis on corporate responsibility and ethics produce different types of professionals from traditional MBA programs? Will an MBA’s preparation produce differing results in corporations when it comes to ethical choices in financial decisions? In a business environment coping with the results of executives’ apparent unrestrained pursuit of self-interest, these questions warrant further examination.
NOTES 1. As used here and later explained, ‘‘spirituality’’ is not a religious concept, but considers how individuals deal with relations among themselves and other living things. 2. See Kanfer (1990) for a review of this literature.
92
FRANK COLLINS ET AL.
3. For example, see Vroom (1966); Sheard (1970); Mitchell and Knudsen (1973); Wanous, Keon, and Latack (1983); Allred, Mallozzi, Matsui, and Raia (1997); Bozeman and Kacmar (1997); Schaefers, Epperson, and Nauta (1997); and Chen and Hoshower (1998). 4. These positive outcomes are considered ‘‘extrinsic,’’ since they represent externallygenerated consequences, as opposed to ‘‘intrinsic’’ outcomes, which are positive based upon personal, internal appeal to the individual (Gailbraith & Cummings, 1967). 5. While a concept such as Machiavellianism could have been chosen here, rather than Career Self-Interest, interest was in gaining insight into the impact of internal incentives that are common to many accountants and that might create conflicts in their ethical decision-making. Machiavellianism, for example, carries a much more sinister context, implying the willingness to harm and use others deliberately for one’s own benefit. A professional can be focused on career self-interest without deliberately ‘‘doing in’’ other people in a Machiavellian sense. 6. Spirituality as defined here and in the literature (Wheat, 1991; Young, Cashwell, & Woolington, 1998, p. 64) is not a religious condition, but merely considers how individuals deal with relations among themselves and other living things. ‘‘Religion’’ is concerned with one’s relationships with a deity[ies] and to be religious is ‘‘to be maintaining faithful devotion to y a deity (Webster, 2002).’’ Hence, while one might associate spirituality with religion, individuals may be spiritual without being religious or vice versa. Arguably, most ‘‘religious’’ persons are spiritual (i.e., have a high concern for others), yet it is possible to be religious without a high concern for others. For example, witness the recent terror bombings in the name of a supreme deity as detailed by Malkin (2004) in ‘‘Let us remember correctly terror ‘in the name of Allah’.’’ In the same fashion, it is possible to have a high concern for others while professing to be an atheist. 7. Two status items from Lowensohn and Collins (2001), as well as two additional reward items not directly related to career or status, were developed by the authors and included in the survey. They were not retained in the final analysis. 8. The scale was also validated in Belaire and Young (2000). 9. Since the hypotheses specified in this study seek to identity the structural relationships among variables, two techniques suggest themselves, Structural Equations Modeling (SEM) and Partial Least Squares (PLS) analysis. Though both of these techniques provide a simultaneous modeling of structural paths and account for measurement error among variables, PLS is more appropriate for exploratory research and is more tolerant of small sample sizes – two characteristics of the present study (Wold, 1966; Gefen, Straub, & Boudreau, 2000; Chin et al., 2003). 10. Chin (1998) noted that the 0.70 cut off that is commonly used in the IS literature should not be rigidly applied. 11. The PLS analysis indicates that scenario three, four, and six, Evading Taxes on Artwork, Bonus at Another’s Expense, and Insider Trading, should be omitted. This is logical and theoretically defensible because each of the remaining scenarios describes issues involving job-related actions (i.e., making an inappropriate journal entry to classify an expense as a capital item), while two of those not included in the latent variable involved immediate self-aggrandizement (Evading Taxes and Bonus at Another’s Expense) and the bonus scenario involves co-workers, and thus introduces the unique dimension of peer relationships. Indeed, prior research has
Unethical Financial Decision-Making
93
shown that individuals dislike the effects of unethical acts on peers (Carlson, Kacmar, & Wadsworth, 2002). 12. This is consistent with Jensen (1994), who notes that self-interest does not preclude an individual from possessing altruistic tendencies. Jensen and Meckling (1994) present a model of individual behavior termed the Resourceful Evaluative Maximizing Model (REMM). Individuals are viewed as adaptable maximizers with unlimited wants; however, they face various constraints in satisfying their desires. Altruistic tendencies would presumably represent a constraint in an individual’s utility maximization scheme, forcing individuals to compare the consequences of alternative courses of action in the presence of the constraint. This also reinforces the distinction emphasized in the theory section of this paper that Career Self-Interest is distinct from Machiavellianism, since the results provide evidence of individuals possessing both a high concern for others and high career self-interest. 13. The PLS analysis was conducted to see if demographics moderated the relationships between the independent variables and our dependent variable. The approach used was presented by Chin et al. (2003) whereby separate latent variables are created for the moderator effect by ‘‘multiplying the indicators from the predictor and the moderator variables (p. 198)’’ to form these separate latent variables. This analysis showed that tenure modified the effect of Career Self-Interest on Unethical Decision-Making (standardized coefficient, 0.131). 14. While a solution such as this may be controversial to some, new frontiers must be explored to combat unethical behavior in business settings. Business professionals could be advised to exercise professional skepticism – e.g., if you sense low ‘‘Concern for Others,’’ be on guard. An employee cannot be fired for low Concern for Others, but managers can certainly monitor employees as they deem appropriate, as long as civil rights are not impaired.
REFERENCES AccountingWeb, Inc. (2003a). Study: Audit firms still bend rules to benefit big clients. www.accountingweb.com (October 28). AccountingWeb, Inc. (2003b). Survey reveals clients deeply dissatisfied with professional service providers. www.accountingweb.com (November 20). AccountingWeb, Inc. (2003c). Scandal-ridden Tyco gets serious about ethics. www.accounting web.com (November 12). Agarwal, R., & Karahanna, E. (2000). Time flies when you’re having fun: Cognitive absorption and beliefs about information technology usage. MIS Quarterly, 24, 665–694. Allred, K., Mallozzi, J. S., Matsui, F., & Raia, C. P. (1997). The influence of anger and compassion on negotiation performance. Organizational Behavior and Human Decision Processes, 70(June), 175–187. Awasthi, V., & Pratt, J. (1990). The effects of monetary incentives on effort and decision performance: The role of cognitive characteristics. The Accounting Review, 65(October), 797–811. Barclay, D., Higgins, C., & Thompson, R. (1995). The partial least squares approach to causal modeling: Personal computer adoption and use as an illustration. Technology Studies, 2, 285–309.
94
FRANK COLLINS ET AL.
Barrett, D. (2003, July 8). Judge: Exceeding $750 million fine against WorldCom would hurt employees. http://www.crn.com/sections/breakingnews/dailyarchives.jhtml?articleId =18823187 Beams, J. D., Brown, R. M., & Killough, L. N. (2003). An experiment testing the determinants of non-compliance with insider trading laws. Journal of Business Ethics, 45, 309–323. Bebeau, M. (1994). Influencing the moral dimensions of dental practice. In: J. Rest & D. Narvaez (Eds), Moral development in the professions (pp. 121–146). Hillsdale, NJ: Lawrence Erlbaum Associates. Belaire, C., & Young, J. S. (2000). Influences of spirituality on counselor selection. Counseling and Values, 44(3), 189–197. Bozeman, D. P., & Kacmar, K. M. (1997). A cybernetic model of impression management processes in organizations. Organizational Behavior and Human Decision Processes, 69(1), 9–30. Carlson, D. S., Kacmar, K. M., & Wadsworth, L. L. (2002). The impact of moral intensity dimensions on ethical decision making: Assessing the relevance of orientation. Journal of Managerial Issues, 14(Spring), 15–30. Chen, Y., & Hoshower, L. B. (1998). Assessing student motivation to participate in teaching evaluations: An application in expectancy theory. Issues in Accounting Education, 13(3), 531–549. Chin, W. W. (1998). The partial least squares approach to structural equation modeling. In: G. A. Marcoulides (Ed.), Modern methods for business research (pp. 295–336). Mahwah, NJ: Lawrence Erlbaum Associates. Chin, W. W., Marcolin, B. L., & Newested, P. R. (2003). A partial least squares modeling approach for measuring interaction effects: Results from a Monte Carlo simulation study and an electronic-mail emotion/adoption study. Information Systems Research, 14(2, June), 189–217. Chung, K. H. (1977). Motivational theories and practices. Columbus, OH: Grid, Inc. CNN.com. (2002). Explaining the Enron bankruptcy. (January 13, posted: 5:14 p.m.). CNN Money. (2002). Andersen sentenced: Accounting firm hit with $500,000 fine, five years’ probation for Enron conviction. http://money.cnn.com/2002/10/16/news/companies/ andersen/ (viewed October 16, 2002). Cohen, J. R., Pant, L. W., & Sharp, D. J. (1995). An exploratory examination of international differences in auditors’ ethical perceptions. Behavioral Research in Accounting, 7, 37–64. Cohen, J. R., Pant, L. W., & Sharp, D. J. (1996). Measuring the ethical awareness and ethical orientation of Canadian auditors. Behavioral Research in Accounting, 8(Supp), 98–119. Cottell, P. G., Jr., & Perlin, T. M. (1990). Accounting ethics: A practical guide for professionals. Westport, CT: Quorum Books. Covey, S. (1989). The 7 habits of highly effective people. London: Simon & Schuster. Dillard, J. F., & Ferris, K. R. (1979). Sources of professional staff turnover in public accounting firms: Some further evidence. Accounting, Organizations and Society, 4(3), 179–186. Drucker, J., & Sender, H. (2002). WorldCom debacle shows how easy fraud can be. Wall Street Journal On-line (June 27). Emler, N., Renwick, S., & Malone, B. (1983). The relationship between moral reasoning and political orientation. Journal of Personality and Social Psychology, 45(5), 1073–1080. Farrell, G., & Horovitz, B. (2002). Stock troubles threaten Stewart’s image. USAToday.com (posted June 25).
Unethical Financial Decision-Making
95
Ferrell, O. C., & Gresham, L. G. (1985). A contingency framework for understanding ethical decision making in marketing. Journal of Marketing, 49(Summer), 87–96. Ferris, K. R. (1977a). A test of the expectancy theory of motivation in the accounting environment. The Accounting Review, 52(July), 605–615. Ferris, K. R. (1977b). Perceived uncertainty and job satisfaction in the accounting environment. Accounting, Organizations and Society, 2(1), 23–28. Ferris, K. R. (1978). Perceived environmental uncertainty as a mediator of expectancy theory predictions: Some preliminary findings. Decision Sciences, 9, 379–390. Fisher, D., & Sweeney, J. (1998). The relationship between political attitudes and moral judgment: Examining the validity of the defining issues test. Journal of Business Ethics, 17(8), 905–916. Ford, R. C., & Richardson, W. D. (1994). Ethical decision making: A review of the empirical literature. Journal of Business Ethics, 13, 205–221. Fornell, C. R., & Larcker, D. F. (1981). Structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(February), 39–50. Gailbraith, J., & Cummings, L. L. (1967). An empirical investigation of the motivational determinants of task performance: Interactive effects between instrumentality-valence and motivation-ability. Organizational Behavior and Human Performance, 2, 237–257. Gefen, D., Straub, D. W., & Boudreau, M.-C. (2000). Structural equation modeling and regression: Guidelines for research practice. Communications of the Association for Information Systems, 4(7). Giacalone, R. A., & Jurkiewicz, C. L. (2003). Right from wrong: The influence of spirituality on perceptions of unethical business activities. Journal of Business Ethics, 46, 85–97. Gillett, P. R., & Uddin, N. (2005). CFO intentions of fraudulent financial reporting. Auditing: A Journal of Practice and Theory, 24(1), 55–75. Gilligan, C. (1982). In a different voice. Cambridge, MA: Harvard University Press. Glass, R. S., & Wood, W. A. (1996). Situational determinants of software piracy: An equity theory perspective. Journal of Business Ethics, 15(11), 1189–1198. Glover, S. H., Bumpus, M. A., Logan, J. E., & Ciesla, J. R. (1997). Re-examining the influence of individual values on ethical decision making. Journal of Business Ethics, 16, 1319–1329. Greenleaf, R. (1977). Servant leadership: A journey into the nature of legitimate power and greatness. New York: Paulist Press. Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Mutivariate data analysis with readings. New York: Macmillan. Harrell, A., Caldwell, C., & Doty, E. (1985). Within-person expectancy theory predictions of accounting students’ motivation to achieve academic success. The Accounting Review, 60(October), 724–735. Hegarty, W. H., & Sims, H. P., Jr. (1978). Some determinants of unethical decision behavior: An experiment. Journal of Applied Psychology, 63(4), 451–457. Hiltebeitel, K., & Jones, S. (1992). An assessment of ethics instruction in accounting education. Journal of Business Ethics, 11, 37–46. Hoffman, J. J., Couch, G., & Lamont, B. T. (1998). The effect of firm profit versus personal economic well being on the level of ethical responses given by managers. Journal of Business Ethics, 17, 239–244. Huitt, W. (2000). The spiritual nature of a human being. http://chiron.valdosta.edu/whuitt/col/ spiritual/spirit.html
96
FRANK COLLINS ET AL.
Hunt, S. D., & Vitell, S. (1986). A general theory of marketing ethics. Journal of Macromarketing, 6, 5–16. Jensen, M. C. (1994). Self interest, altruism, incentives, and agency theory. Journal of Applied Corporate Finance, 7(2), 40–45. Jensen, M. C., & Meckling, W. H. (1994). The nature of man. Journal of Applied Corporate Finance, 7(2), 4–19. Jones, T. M. (1991). Ethical decision making by individuals in organizations: An issuecontingent model. Academy of Management Review, 16(2), 366–395. Jones, J., Massey, D. W., & Thorne, L. (2003). Auditors’ ethical reasoning: Insights from past research and implications for the future. Journal of Accounting Literature, 22, 45–103. Kamberg, M. (2001). Making ethical business decisions. Women in Business, 53(2), 22–25. Kanfer, R. (1990). Motivation theory and industrial and organizational psychology. In: M. D. Dunnette & L. M. Hough (Eds), Handbook of industrial and organizational psychology (2nd ed.). Palo Alto, CA: Consulting Psychologists Press, Inc. KPMG LLP. (2006). Forensic integrity survey 2005–2006. Available at http://www.us.kpmg. com/news/index.asp?cid=2051 (viewed May 22, 2006). Kren, L. (1990). Performance in a budget-based control system: An extended expectancy model approach. Journal of Management Accounting Research, 2(1), 100–112. Lightner, S. M., Adams, S. J., & Lightner, K. M. (1982). The influence of situational, ethical, and expectancy theory variables on accountants’ underreporting behavior. Auditing: A Journal of Practice and Theory, 2(Fall), 1–12. Loe, T. W., Ferrell, L., & Mansfield, P. (2000). A review of empirical studies assessing ethical decision making in business. Journal of Business Ethics, 2(3), 185–204. Louwers, T., Ponemon, L., & Radtke, R. (1997). Examining accountants’ ethical behavior: A review and implications for future research. In: V. Arnold & S. G. Sutton (Eds), Behavioral accounting research: Foundations and frontiers (pp. 48–88). Sarasota, FL: American Accounting Association. Lowensohn, S., & Collins, F. (2001). The role and perceptions of independent audit partners in the governmental audit market. Accounting and the Public Interest, 1(1), 17–41. Malkin, M. (2004, September 12). Let us remember correctly terror ‘in the name of Allah’. Houston Chronicle. http://www.freerepublic.com/focus/f-news/1215356/posts Masters, B. A. (2004, July 17). Martha Stewart sentenced to prison: Punishment postponed as she appeals verdict. http://www.washingtonpost.com/wp-dyn/articles/A56199-2004Jul16.html Mehta, S. N. (2003). MCI: Is being good good enough? Fortune, October 27, 117–124. Milliman, J., Ferguson, F., Trickett, D., & Condemi, B. (1999). Spirit and community at southwest airlines: An investigation of a spiritual values based model. Journal of Organizational Change Management, 12(3), 221–233. Mitchell, T. R., & Knudsen, B. W. (1973). Instrumentality theory predictions of students’ attitudes towards business and their choice of business as an occupation. Academy of Management Journal, 16(1), 41–52. MSNBC News. (2005, July). Ebbers sentenced to 25 years in prison: Ex-WorldCom CEO guilty of directing biggest accounting fraud. MSNBC News. http://www.msnbc.msn.com/id/ 8474930/ Nonis, S., & Swift, C. O. (2001). Personal value profiles and ethical business decisions. Journal of Education for Business, 76 (5), 251–256. Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Unethical Financial Decision-Making
97
O’Fallon, M. J., & Butterfield, K. D. (2005). A review of the empirical ethical decision-making literature: 1996–2003. Journal of Business Ethics, 59, 375–413. Parker, L. D., Ferris, K., & Otley, D. T. (1989). Accounting for the human factor. Sydney, Australia: Prentice Hall. Ponemon, L., & Gabhart, D. (1993). Ethical reasoning in accounting and auditing. Vancouver: CGA-Canada Research Foundation. Riordan, D. A., & Riordan, M. P. (2003). Who has your numbers? Strategic Finance, 84(10, April), 22–26. Schaefers, K. G., Epperson, K. L., & Nauta, M. M. (1997). Women’s career development: Can theoretically derived variables predict persistence in engineering majors? Journal of Counseling Psychology, 44(2), 173–183. Schlank, R. (2002). Deloitte responds to growing risk of unethical behavior. AccountingWeb, Inc. www.accountingweb.com (August 19). Seiler, R. B., Collins, F., & Johnson, G. (1987). Side bets and turnover behavior in the public accounting profession: A refinement of the Mobley model. Proceedings of the American Accounting Association Annual Meeting, 98, 15. Sender, H. (2002, August 21). Accounting issues at WorldCom speak volumes about disclosures. Wall Street Journal. Shamir, B. (1990). Calculations, values, and identities: The sources of collectivistic work motivation. Human Relations, 43(4), 313–332. Shapeero, M., Koh, H. C., & Killough, L. N. (2003). Underreporting and premature sign-off in public accounting. Managerial Auditing Journal, 18(6/7), 478–489. Shaub, M. K. (1994). An analysis of the association of traditional demographic variables with the moral reasoning of auditing students and auditors. Journal of Accounting Education, 12(1), 1–26. Shaub, M. K., Finn, D. W., & Munter, P. (1993). The effects of auditors’ ethical orientation on commitment and ethical sensitivity. Behavioral Research in Accounting, 5, 145–169. Sheard, J. L. (1970). Intrasubject prediction of preferences for organization types. Journal of Applied Psychology, 34(3), 248–252. Smith, A. (1976). The theory of moral sentiments. Indianapolis, IN: Liberty Classics. Straub, D., Gefen, D., & Boudreau, M.-C. (2006). Quantitative, positivist research methods in information systems. Quantitative Research in Information Systems. 6. http://dstraub. cis.gsu.edu:88/quant/6issues.asp, accessed May 24, 2006. Tang, T. L., & Chiu, R. K. (2003). Income, money ethic, pay satisfaction, commitment, and unethical behavior: Is the love of money the root of evil for Hong Kong employees? Journal of Business Ethics, 46, 13–30. Trevino, L. K. (1986). Ethical decision making in organizations: A person-situation integrationist model. Academy of Management Review, 3, 601–617. Trevino, L. K., & Weaver, G. (2003). Managing ethics in business organizations: Social science perspectives. Stanford, Palo Alto, CA: Stanford Business Books. Verschoor, C. C. (Ed.). (2001). Ethical behavior brings tangible benefits to organizations. Strategic Finance, 82(May), 20. Vroom, V. H. (1964). Work and motivation. New York: Wiley. Vroom, V. H. (1966). Organizational choice: A study of pre- and postdecision processes. Organizational Behavior and Human Performance, 1, 212–225. Wagner-Marsh, F., & Conley, J. (1999). The fourth wave: The spiritually-based firm. Journal of Organizational Change Management, 12(4), 292–301.
98
FRANK COLLINS ET AL.
Wanous, J. P., Keon, T. L., & Latack, J. C. (1983). Expectancy theory and occupational/ organizational choices: A review and test. Organizational Behavior and Human Performance, 32, 66–86. Webster’s New Collegiate Dictionary. (2002). Springfield, MA: G & C Merriam Company. Wheat, L. (1991). Development of a scale for the measurement of human spirituality. Dissertation Abstracts International, 9205143. Wold, H. (1966). Estimation of principal components and related models by interactive least squares. In: P. R. Krishnaiaah (Ed.), Multivariate analysis. New York: Academic Press. Young, J. S., Cashwell, C., & Woolington, V. (1998). The relationship of spirituality to cognitive and moral development and purpose in life: An exploratory investigation. Counseling and Values, 43(1), 63–69. Zikmund, P. (2003). Ferreting out fraud. Strategic Finance, 84(April), 28–32.
APPENDIX A. SUMMARY OF ETHICAL DECISION-MAKING SCENARIOS Scenario 1 ‘‘Inappropriate Capitalization’’ is based on accounting difficulties at WorldCom. WorldCom transferred more than $3.8 billion in expenses to capital accounts over five quarters (Drucker & Sender, 2002), and has agreed to a $750 million fine for ‘‘falsifying ledgers to record billions of dollars in operating expenses as capital expenses, allowing the company to claim a profit when it was in fact losing money (Barrett, 2003).’’ Based on the WorldCom situation, respondents were asked how likely they were to capitalize a material amount of advertising costs rather than properly treating them as expenses.
Scenario 2 ‘‘Converting Allowance for Doubtful Accounts to Income’’ also involves WorldCom and its manipulation of the allowance for doubtful accounts as a means of inappropriately managing income. WorldCom inflated accounts receivable by amounts that had been deemed uncollectible, resulting in a $2.6 billion pre-tax increase in net income (Sender, 2002). In our survey, respondents were asked how likely they would be to make a journal entry transferring a significant amount of allowance for doubtful accounts to ‘‘Other Income,’’ an entry with no justification in fact.
Unethical Financial Decision-Making
99
Scenario 3 ‘‘Evading Taxes on Artwork’’ involves Dennis Kozlowski, former CEO of Tyco, Inc., who allegedly requested that an art consultant remove a $425,000 painting from his luxury apartment and ship it to Tyco offices, where an employee allegedly signed for it and immediately reshipped it to the Kozlowski apartment. The action was orchestrated to avoid New York state taxes on the transaction. Consequently, we created a scenario involving Native American art that was similarly being ‘‘shipped’’ to corporate offices to evade taxes.
Scenario 4 ‘‘Doing Wrong for the Greater Good’’ evolved from our use of the spirituality variable. Here, it seems likely that overly zealous individuals would break rules they feel are inappropriate if the end-result offers a significant ‘‘worthy’’ benefit, i.e., the end justifies the means. This argument is often used to discourage whistleblowing, since revealing unethical behavior may lead to corporate failure, layoffs, and potentially dire consequences for the community. Consequently, we constructed a scenario where respondents were asked how likely they would be to manipulate program costs for handicapped employees to insure continuation of a program designed to provide the handicapped with employment.
Scenario 5 ‘‘Self-Dealing’’ is based on the alleged behavior of Andrew Fastow, former CFO of Enron, and involves self-aggrandizement. Fastow created numerous special entities, usually partnerships, whose main purpose was to transfer debt from Enron’s balance sheet. While Fastow controlled some of these entities, he went a step further and had a personal investment stake in others. Thus, he was in fact dealing with himself. Consequently, we developed the fifth scenario where respondents were asked how likely they were to allow their employer’s firm utilize a vendor in which they had a personal interest, even though other vendors would be more advantageous for their employer.
100
FRANK COLLINS ET AL.
Scenario 6 ‘‘Bonus at Another’s Expense’’ also involves self-aggrandizement and is not based on a specific business entity. Rather, while the others have involved improper treatment of customers or employers, we wanted to add the dimension of improper treatment of co-workers. This seems appropriate in our modern ‘‘me-first’’ culture and the generally diminished status of loyalty that is believed to exist in the workplace. Consequently, we developed a scenario where our respondents were asked how likely they would be to keep (in secret) a bonus they had received that rightfully belonged to a co-worker. Scenario 7 ‘‘Insider Trading’’ is based on the recent indictment of Martha Stewart, founder and former president of Martha Stewart Omnimedia, Inc. She allegedly received insider trading information from the now convicted former CEO of ImClone, Inc, Sam Waksal. It is alleged that when Waksal learned that an important cancer-fighting drug developed by ImClone would not be approved for use by the FDA, he called Stewart before the information was made public. Stewart then allegedly sold her investment in ImClone stock based on this ‘‘insider’’ information (Farrell & Horovitz, 2002). Consequently, we created a scenario involving insider trading where our respondents were asked how likely they were to use insider information to avoid a serious loss they would face on a personal investment.
EFFECTS OF INFORMATION LOAD AND COGNITIVE STYLE ON INFORMATION SEARCH STRATEGIES Charles F. Kelliher and Lois S. Mahoney ABSTRACT The study investigates how changes in the information load and the cognitive characteristics of the decision maker affect information processing behavior. Two Jungian personality dimensions of perception type and dominant personality style were isolated for studying how individual cognitive differences influence search behavior. The results show that information processing behavior is contingent on the demands of the task (information load), the decision maker’s dominant (and most developed) personality style, the decision maker’s preferred type of perceiving (either sensing or intuition), and the interaction between the task (information load) and the personality dimensions. These results could improve the quality of decisions by increasing individual’s self-insight and awareness of other types of search behaviors. Furthermore, management information systems, accounting reports, decision aids, and descriptive decision models may need to consider both the task environment and the cognitive characteristics of the decision maker to be beneficial.
Advances in Accounting Behavioral Research, Volume 10, 101–126 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10005-3
101
102
CHARLES F. KELLIHER AND LOIS S. MAHONEY
INTRODUCTION In today’s environment, information is considered essential in determining the business’s financial condition, in executing business transactions, and in making business decisions (Hollander, Denna, & Cherrington, 2000). Critical to the success of every accountant is the ability to gather, produce, and interpret financial and nonfinancial information as they interface with each and the ability to link data, knowledge, and wisdom to provide quality advice (AICPA, 2000). As the use of technology now allows for the collection and maintenance of more and more information, it is increasingly important to understand how accountants search, process, and make timely decisions with this increase in information. Research on predecisional behavior is concerned with how people search, code, weigh, and combine information to make decisions. Though prior research has shown that information search strategies are a function of information load, many of these studies have not taken into account the cognitive style (personality style and perception type) of the decision maker. Accounting researchers suggested that personal characteristics, as well as information load, systematically influence information processing behavior (Ho & Rodgers, 1993; Chenhall & Morris, 1991; Casey, 1980; Mock & Vasarhelyi, 1984). Thus, both the decision task and the decision maker should be considered to fully understand individual search strategy and decision-making. This study extends prior research by examining the effects of cognitive characteristics and their interaction with information load on the information processing behavior employed by decision makers within a predecisional behavior framework. This study examines the impact of cognitive characteristics and information load on traditional measures of information search – the proportion of information search, the variability of information search, processing time, and search strategies (Payne, 1976). Specifically, the following research questions are addressed: How do the individual cognitive characteristics of personality style and perception type affect information processing behavior? How does the interaction between increases in the supply of information and the individual cognitive characteristics of personality style and perception type affect information processing behavior? The results show that information processing behavior is not only affected by the individual’s cognitive characteristics but is also affected by the interaction of these cognitive characteristics with information load.
Effects of Information Load and Cognitive Style
103
Specifically, the proportion of information search is significantly related to dominant personality style, as decision makers with a dominant perception personality style search more information than decision makers with a dominant judgment personality style. Furthermore, individuals with dominant perception personality style adjust their search patterns more as information load increases. Finally, sensing individuals with a dominant judgment personality style search a higher variability of information and take longer to process this information. Besides providing a better understanding of the underlying decision process, this study helps explain individual differences. While much of the early research examined various decision tasks including consumer buying, financial, auditing, and capital budgeting, this study is one of the first to examine a management decision involving a performance evaluation setting using internal accounting information where the potential for information overload may be greatest (see Swain & Haka, 2000; Cook, 1993; Payne, Bettman, & Johnson, 1993; Chewning & Harrell, 1990; Ford, Schmitt, Schlechtman, Hults, & Doherty, 1989; Biggs, Bedard, Gaber, & Linsmeier, 1985). These results can assist accountants with designing systems based on user characteristics and user inputs (changing the information set available to the decision maker) or by improving the decision maker’s ability to process information (increasing their self-insight and knowledge of the characteristics and limitations of their decision making processes). Additionally in systems design where it is important to consider all present and potential future information needs, decision makers with a dominant perception personality style should be assigned to the task as they tend to gather more information and consider all alternatives to ensure that they do not miss important information that may be critical to success of the systems design. In addition, as decision maker’s cognitive abilities affect their search process and how they deal with the world around them, then different accountants may be assigned to different business tasks depending on the required objectives. If timeliness is important to a task, a sensing type accountant with a dominant perception personality style should be assigned to perform a low load task. If accountants are interested in improving thoroughness and efficiency of decisions, then designers of information systems, accounting reports, decision aids, and descriptive models may need to consider both task characteristics, individual cognitive characteristics, and more importantly, their interaction. The remainder of the paper is organized as follows. The following section reviews the cognitive style, information overload, and information search behavior and provides the theoretical foundation for the development of the
104
CHARLES F. KELLIHER AND LOIS S. MAHONEY
research hypotheses. The next section describes the research design used to test the hypotheses followed by the results of the statistical tests and discussion of the findings. The final section presents conclusions and provides some suggestions for future research.
BACKGROUND AND HYPOTHESES DEVELOPMENT The way that an individual perceives information and the method used to process information is referred to as cognitive style. Jung’s theory of psychological type (Jung, 1923) asserts that an individual uses four mental processes – sensing, intuition, thinking, and feeling – to solve problems and make decisions. Sensing and intuition are opposite ways of perceiving or different ways that an individual can acquire information about a problem or situation. Thinking and feeling are opposite ways of deciding or different ways than an individual can process information to make a judgment. Jung (1923) asserts that individual behavior can be traced to differences in personality types. The Myers-Briggs type indicator (MBTI) is a forced choice, self-reported inventory that was developed to classify individuals along the cognitive style dimensions adapted from Jung’s theory of psychological type (Jung, 1923). An individual’s perception type, their preference with regard to information acquisition, reflects the way they perceive or become aware of things. Individuals can be classified into two perception types: (1) sensing types who use their five senses to observe and gather information, or (2) intuitive types who obtain information through indirect and sometime unconscious perceptions. Additionally, individuals are classified on judgment type, the way they process the information that they acquire. They can be classified as thinking types (i.e., those who look for cause and effect relationships) or feeling types (i.e., those who are concerned about the human aspect of a decision). Though everyone uses all four mental processes in varying degrees, they are not equally preferred or developed. Usually a person has a dominant personality type or a predisposition to use and develop either their perception type or judgment type more than the other. It affects what they perceive and how they judge (Myers, 1962). The dominant personality style provides overall direction and a consistent focus to the personality and decision-making processes. Beyond these basic mental processes, Jung’s theory uses two preferred attitudes toward the world – introversion or extraversion. This dimension describes an individual’s preferred mode of interacting with their environment.
Effects of Information Load and Cognitive Style
105
Information search behavior studies have reported that the proportion of information searched, the variability in the proportion search, the time it takes to search, and the information search strategies are affected by information load (Swain & Haka, 2000; Cook & Swain, 1993; Chewning & Harrell, 1990; Biggs et al., 1985). As only the individual’s cognitive style of perception type (information acquisition) and dominant personality style affect individual’s information search behavior, this study will concentrate on these dimensions of MBTI. Proportion of Information Searched According to cognitive literature, the cognitive styles of perception type (sensing vs. intuition) and dominant personality style (perception or judgment) should be related to the amount of information that an individual gathers. People that prefer sensing over intuition to learn about the problem or situation focus their attention on objective facts and details. Sensing types prefer to gather, remember, and work with many facts and are more comfortable using their physical senses to amass data about the situation or problem. Sensors focus on actualities and consider information that is explicitly stated to be the most reliable (Myers & McCaulley, 1985). In contrast, people who prefer intuition over sensing to learn about the problem focus their attention on possibilities rather than observable facts and details. Intuitive types often consider meanings, relationships, and hidden possibilities beyond the reach of their senses. Intuitive types prefer dealing with abstract concepts and dislike detail oriented work (Lawrence, 1993). In general, sensing types prefer detail type information focusing on concrete facts and figures, while intuitive types prefer global type information and perceive problems holistically (Wolk & Nikolai, 1997; Bradley & Hebert, 1997; Ho & Rodgers, 1993; Cheng, Luckett, & Schultz, 2003). Thus, sensing types should search a higher proportion of information in a dimension/alternative setting than individuals that are intuitive types. If an individual’s dominant personality style is perception, they ‘‘prefer to keep decisions open as long as possible before doing anything irrevocable because they do not know nearly enough about it yet’’ (Myers, 1980). These individuals prefer to be in the information-gathering mode compared to a decision-making mode. In contrast, individuals with a dominant judgment personality style like to have matters settled and decided as quickly as possible so that they will know what is going to happen, can plan for it, and be prepared for it (Myers, 1980). These individuals prefer to be in a decisionmaking mode evaluating information to render a decision, rather than
106
CHARLES F. KELLIHER AND LOIS S. MAHONEY
gathering information. In summary, individuals with judgment as their dominant personality style prefer to gather the minimum of information before making a decision, while individuals with perception as their dominant personality style tend to gather more information than they need to make a decision. If these constructs of the Jungian typology translate into propensities to act, decision makers with a dominant perception personality style should search more information than decision makers with a dominant judgment personality style. The following research hypotheses are advanced: H1A. The interaction of dominant perception personality style and perception type will affect the proportion of information searched. H1B. Individuals with a dominant perception personality style will have a higher proportion of search compared to individuals with a dominant judgment personality style. H1C. Individuals with a sensing type will have a higher proportion of search compared to individuals with an intuitive type. Variability in the Proportion Searched Across Alternatives An individuals’ cognitive style also influences the way they search information across alternatives. Sensing types develop acute powers of observation, have memory for facts and detail, and have a propensity to move in a systematic, step-by-step manner, tying each new fact to past experiences, and testing its relevance in practical use (McCaulley, 1978). They prefer sequentially organized activities and an established way of doing things (Schloemer & Schloemer, 1997). Thus, sensing types are expected to gather the same information for all alternatives before making their decision. On the other hand, intuitive types, who like to consider meanings and relationships in the information they gather, are expected to look at a piece of information and let inspiration determine the next piece of information they examine. Because they are inspired to examine different pieces of information for each situation, they are not expected to examine the same information for each alternative. Similarly, if individuals have a perception personality style and are in the information-gathering mode, they are expected to systematically gather all the necessary information for each alternative before making their decision. Thus, they would search relatively the same information for all alternatives. Individuals with a judgment personality style, who prefer to make decisions but are currently forced into an information-gathering mode, will likely
Effects of Information Load and Cognitive Style
107
examine different pieces and amounts of information for each alternative in an attempt to come to closure quicker in order to render their decision. Thus, individuals who have a dominant perception personality style or have a sensing perception type are expected to look at relatively the same information for each alternative before rendering their decision. On the other hand, individuals with a dominant judgment personality style will look at only enough information for each alternative until they are comfortable making a decision. Similarly, intuitive types will let inspiration guide them to the next piece of information to examine, thus varying the information they look at for each alternative. This leads us to the second set of hypotheses: H2A. The interaction of dominant perception personality style and perception type will affect variability of search. H2B. Individuals with a dominant perception personality style will have a lower variability of search compared to individuals with a dominant judgment personality style. H2C. Individuals with a sensing type perception will have a lower variability of search compared to individuals with an intuitive type. Average Processing Time Perception type and dominant personality style should also influence the time spent processing information. Sensing types prefer to assimilate facts quickly, finding out about what exists. Intuitive types, on the other hand, are better at exploring possibilities, reading between the lines, and placing a greater emphasis on imagination and inspiration. These characteristics were reported by Vassen, Baker, and Hayes (1993) who found that intuitive type auditors took significantly longer performing an audit judgment task than sensing type individuals. In regard to dominant personality style, individuals with a dominant perception personality style prefer to keep decisions open as long as possible before doing anything irrevocable because they do not know nearly enough about it yet (Myers, 1980). Judgment personality style individuals like to have matters settled and decided as promptly as possible, so that they will know what is going to happen, can plan for it, and be prepared for it. Once they feel they have enough information to render a decision, they immediately stop and move onto the next decision and are not concerned whether all their decisions are based upon the same information. If the
108
CHARLES F. KELLIHER AND LOIS S. MAHONEY
perception personality style is dominant, the decision maker prefers to be in the information-gathering mode compared to a decision-making mode. Conversely, if the judgment personality style is dominant, the decision maker prefers to be in a decision-making mode evaluating information to render a decision, rather than gathering information to render a decision. These differences should be reflected in the time spent processing the information. Intuitive types and dominant perception personality style individuals should spend more time per informational cue exploring possibilities compared to sensing types and dominant judgment personality style individuals. This leads us to the following research hypotheses: H3A. The interaction of dominant perception personality style and perception type will affect processing time. H3B. Individuals with a dominant perception personality style will have a higher average processing time compared to individuals with a dominant judgment personality style. H3C. Individuals with an intuitive type will have a higher average processing time compared to individuals with a sensing type. Information Load and Search Strategy Research has shown that information search strategies are a function of information load (Swain & Haka, 2000; Stocks & Harrell, 1995; Biggs & Wild, 1985). The importance of this measure is that the direction of search affects the order that the alternatives (or dimensions) are searched and subsequently eliminated. This may affect their final choice. Decision makers become more deliberate processing information as the information load increases because more comparisons are necessary to differentiate and integrate the new information into the cognitive representation of the task they are constructing in memory (Shugan, 1980; Iselin, 1988). They respond to increases in information load by processing more information until they reach their limits on the amount of information that they can process (Stocks & Tuttle, 1998; Stocks & Harrell, 1995). Payne (1976) found information search strategy was not only contingent on information load, but also on the type of information load – dimensions versus alternatives. When faced with a smaller amount of information, decision makers employed high processing strategies searching most of the available information for both dimensions and alternatives. As the information load increased, decision makers employed simpler processing
Effects of Information Load and Cognitive Style
109
strategies largely to eliminate some alternatives after only a limited search of the dimensions. The effect from increasing the number of dimensions was greater than the effect from increasing the number of alternatives. As information load increases, search patterns are expected to shift from strategies that involve a constant search per alternative to a varied search across dimensions (Anderson, 1988; Biggs et al., 1985; Payne, 1976). Since sensing types prefer to gather, remember, and work with many facts in a systematic, step-by-step manner, their search patterns are expected to better accommodate the additional comparisons necessary to integrate the increase in information load. Intuitive type individuals, who focus their attention on possibilities rather than observable facts and details, are likely to significantly alter their search patterns to accommodate the increase in information load before making their decision. Judgment dominant personality style individuals, who like closure and only look at enough information for each alternative until they are comfortable making a decision, should be better able to adapt to the increase in information load and should not need to change their search behaviors. Perception dominant personality style individuals, who like to search all the necessary information for each alternative before making a decision, should be forced to adapt their search strategies more in order to arrive at a decision with the increase in information load. The last hypothesis examines the interaction between cognitive style and increase information load, both alternatives and dimensions, on information search strategy: H4A. As information load increases, the interaction of dominant perception personality style and alternatives will affect the search strategy. H4B. As information load increases, the interaction of dominant perception personality style and dimensions will affect search strategy. H4C. As information load increases, the interaction of perception type and alternatives will affect search strategy. H4D. As information load increases, the interaction of perception type and dimensions will affect search strategy.
METHOLOGY A laboratory experiment that required the participants to search sequentially for information was developed to test the four hypotheses. Explicit information search techniques and the search characteristic measures
110
CHARLES F. KELLIHER AND LOIS S. MAHONEY
developed by Payne (1976) and Shields (1980, 1983) were used to examine search behavior at various levels of information load (defined by number of alternatives and dimensions). The MBTI was used to identify Jungian personality traits to test for individual cognitive differences.1 Experimental Task Participants were provided with operating information regarding a number of different stores. The decision task required participants to choose the store with the worst operating performance. Performance dimensions for each store were provided to the participants to use in evaluating the operating performance. Four independent decision tasks were developed in matrices that contained m alternatives (stores) and n dimensions (performance measures). The matrix (row column) format represented the typical presentation of information in a management performance report. Each performance report contained either six or twelve alternatives (stores), or either six or twelve dimensions (performance measures).2 This resulted in four experimental tasks with an information load varying from a low of 36 cues (6 alternatives 6 dimensions) to a high of 144 cues (12 alternatives 12 dimensions). The performance measures (dimensions) and their qualitative values were developed and modified in a series of steps that included discussions with an experienced retail manager and business policy instructors at the university, and pilot testing the materials using graduate students. Each cell was assigned a level so that each store (alternative) had both favorable and unfavorable qualities to avoid an obvious choice (Swain & Haka, 2000). If a particular store (alternative) was clearly the ‘‘correct’’ choice, participants would not be likely to engage in a careful and intensive information search. Participants were not expected to converge to any particular store, and posttest analysis of participants’ selection did not reveal a dominant choice for any of the tasks. The final case materials were reviewed again to provide assurance that the materials were reasonable and contained familiar, unambiguous terminology that the participants could understand. Table 1 contains a list of the performance measures (dimensions) and their respective qualitative levels. Instrument The experiment was conducted using a computerized information display board. All values contained in each cell of the matrix were hidden from the
Effects of Information Load and Cognitive Style
Table 1.
111
Performance Measures Used in Reports (and Qualitative Levels).
Panel A: Six performance-measure reports Dollar sales per transaction Inventory shrinkage Gross margin percentage Adjusted trend in sales Inventory turnover Volume of store traffic
Levels Low – moderate – high Low – moderate – high Low – moderate – high Upward – level – downward Low – moderate – high Low – moderate – high
Panel B: Additional six performance measures Amount spent on advertising Previous performance of the store Local market share Employee turnover Number of customer complaints Credit policies
Levels Low – moderate – high Above average – average – below average Low – moderate – high Low – moderate – high Low – moderate – high Aggressive – passive
participants. The user selected a performance measure and a store to reveal one cue at a time until they believed that they had evaluated sufficient information to select the store with the worst operating performance. The procedure was repeated until the participant had evaluated each of the four experimental tasks presented in random order. The software unobtrusively recorded their information search activity, as it kept track of what information was selected, the order that it was selected, and the time between selections. Conduct of Experiment Participants were senior level business students enrolled in a capstone business policy course at a four-year university. The MBTI questionnaire was administered at the beginning of a regularly scheduled class. In order to include participants with a clear perception type preference, only participants with a minimum difference between their sensing and intuition score of at least ten points were asked to continue in the study.3 Forty-eight volunteers signed up to complete the experiment outside class. At the beginning of the experimental session, each participant received details of the task (background information, summary of operations, and description of the performance measures), received familiarization training with both the format and content of a performance report, and received instructions regarding the commands necessary to interact with the computerized information display board.4 Then each participant evaluated four stores by
112
CHARLES F. KELLIHER AND LOIS S. MAHONEY
three performance measures practice decision task. The purpose was to make sure that the participants understood the instructions and the task, and was able to use the computer assisted information board. Each participant then completed the four required tasks (administered in random order). At the conclusion of the experiment, each participant answered a short exit questionnaire and received $5.00 as compensation.5 Independent Variables Information load is the primary independent variable of interest in this study. The amount of information load in a performance report was varied experimentally through two factors: alternatives (6 or 12 stores to evaluate) and dimensions (6 or 12 performance measures used to describe each alternative). The research investigated whether a decision maker’s information acquisition behavior was dependent on their personality preferences as measured by the MBTI. The MBTI was used because of its widespread usage in prior research and counseling, and because many studies have examined its reliability and validity (McCaulley, 1978; Myers, 1962).6 Two independent variables were defined for isolating the relevant cognitive style dimensions: dominant personality style (perception or judgment) and perception type (sensing or intuition). Dependent Variables Four dependent variables, commonly used for information search behavior research (Swain & Haka, 2000; Cook, 1993), were calculated for each decision task from the detailed record of what information was selected, the order of selection, and the time each cue was examined: Proportion of Information Searched – amount of information selected divided by the amount of information available Variability in the Proportion Searched Across Alternatives – the variation in the number of dimensions searched across alternatives, measured by the standard deviation Averaging Processing Time – the average time spent examining each information item Information Search Strategies7 – a measure of whether decision makers tend to search across the dimensions and within an alternative
Effects of Information Load and Cognitive Style
113
(inter-dimensional search), or search within a particular dimension and across alternatives (intra-dimensional search) Data Analysis Several steps were used to analyze the data and test the hypotheses. Initially, a mixed within/between ANOVA8 was used to test for significant main effects and two-way interactions (Keppel, 1982; Swain & Haka, 2000). The between subject factors were dominant personality style and perception type, and the repeated within subject factors were alternatives (stores) and dimensions (performance measures). Additional planned contrasts were constructed to test the between-subject interactions. A separate, univariate analysis was run to test for within subject effects for each dependent variable.
RESULTS AND INTERPRETATIONS The following section reports the means and the repeated measures ANOVA for each dependent variable. The statistical tests used to investigate each hypothesis also are discussed. The independent variable of cognitive style is operationalized as dominant personality style and perception type, while the independent variable of information load is operationalized as the number of alternatives and the number of dimensions. Thus, both information load and cognitive style are analyzed as two separate independent variables. Hypotheses 1: Proportion of Information Searched Table 2 reports the means for the proportion of information searched across alternatives. H1A is not supported as the repeated measures ANOVA reports that the interaction between dominant personality style and perception type is not significant (F=1.67; p=0.2011). H1B is supported as the proportion of information searched is significantly related to dominant personality style (F=6.60; p=0.0135). In all four tasks the proportion of information searched by decision makers with a dominant perception personality style was higher than it was for decision makers with a dominant judgment personality style (by an average of 8.4% over the four tasks). The univariate analysis and LSM tests revealed that the differences were significant in two out of the four tasks ( p=0.0382 for 12 alternatives 12 dimensions task;
114
CHARLES F. KELLIHER AND LOIS S. MAHONEY
Table 2.
Proportion of Information Searched.
Panel A: Mean values of proportion of information searched Alternatives 6 6 Dimensions 6 12 Sensing Perception 71.1 55.0 Judgment 59.0 41.7 Intuition Perception 65.5 44.2 Judgment 64.1 44.7 Sensing means Intuition means Perception means Judgment means Overall means
65.1 64.8 68.3 61.6 64.9
48.4 44.5 49.6 43.2 46.4
12 6
12 12
61.1 44.4
36.7 28.3
58.3 50.5
36.9 29.1
52.8 54.4 59.7 47.5 53.6
32.5 33.0 36.8 28.7 32.8
Panel B: Repeated measures ANOVA Sources of Variation Between subjects Perception type (P) Dominant personality style (S ) PS Within subjects Alternatives (A) AP AS Dimensions (D) DP DS AD
F-ratio
p-value
0.02 6.60 1.67
0.8806 0.0135 0.2011
55.05 0.85 1.14 85.33 0.30 0.27 0.37
0.0001 0.3572 0.2874 0.0001 0.5810 0.6042 0.5499
and p=0.0149 for 12 alternatives 6 dimensions task). These results are consistent with Myers’s personality profiles. Decision makers with a dominant perception personality style, who prefer to keep the decision open, searched more of the available information to avoid missing any detail. In contrast, decision makers with a dominant judgment personality style, who prefer to have matters settled more quickly so they will know what is going to occur and can plan for it, selected less information. H1C is also not supported for perception type is not significant (F=0.02, p=0.8806). Overall, H1B is supported as dominant perception personality style individuals search a higher proportion of information than dominant judgment personality style and no support exists for H1A and H1C.
Effects of Information Load and Cognitive Style
115
Hypotheses 2: Variability in the Proportion Searched Across Alternatives Table 3 reports the means for the variability in the proportion of information searched across alternatives.9 The repeated measures ANOVA reported that there was a significant interaction between dominant personality style and perception type (F=4.86; p=0.0325), thus supporting H2A. Fig. 1 shows that for sensing individuals there was a significant difference in the variability of search measure depending on whether perception or judgment was the dominant personality style (planned contrast F=12.38; p=0.0011). For intuitive types, there was no significant Table 3.
Variability in the Proportion of Information Searched across Alternatives.
Panel A: Mean values of variability in the proportion searched across alternatives Alternatives 6 6 12 Dimensions 6 12 6 Sensing Perception 11.5 10.6 13.7 Judgment 23.2 19.9 25.8 Intuition Perception 18.0 17.0 20.6 Judgment 19.1 17.0 22.3 Sensing means Intuition means Perception means Judgment means Overall means
17.4 18.6 14.7 21.2 18.0
15.3 17.0 13.8 18.4 16.1
19.8 21.5 17.1 24.1 20.6
12 12 7.4 19.6 16.3 18.2 13.5 17.3 11.8 18.9 15.4
Panel B: Repeated measures ANOVA Sources of Variation Between subjects Perception type (P) Dominant personality style (S ) PS Within subjects Alternatives (A) AP AS Dimensions (D) DP DS AD
F-ratio
p-value
0.86 7.36 4.86
0.3615 0.0094 0.0325
1.00 0.45 0.60 10.56 0.33 0.15 3.46
0.3173 0.5106 0.4390 0.0021 0.5601 0.6974 0.0681
116
CHARLES F. KELLIHER AND LOIS S. MAHONEY
24 22
Variability
20 18 16 14 12 Perception 10
Judgment
8 Sensing
Intuition Perception type
Fig. 1.
Variability of Search: Dominant Personality Style Perception Type.
difference regardless of the dominant personality style. The upward sloping line shows a significant difference between sensing and intuitive types when perception was the dominant personality style (planned contrast F=4.90; p=0.0321). There was no significant difference between sensing and intuitive types when judgment was the dominant personality style (the relatively flat line). Table 3 also reported a significant relationship between the variability of search and dominant personality style (F=7.36; p=0.0094). However, when examining the cell means in Table 3, the differences between perception and judgment personality style only exists for sensing types, thus indicating that the interaction affect overrides the main effect and H2B is not supported. Table 3 also shows that H2C is not supported as perception type is not significant (F=0.86, p=0.3615). Overall, H2A is supported as there is a significant interaction effect between dominant perception personality style and perception type as sensing type individuals with perception dominant personality style had a lower variability of search compared to sensing type individuals with judgment dominant personality style. H2B and H2C are not supported. Hypotheses 3: Average Processing Time Table 4 presents the means for the average processing time measure. The repeated measures ANOVA reported that there was a significant interaction
Effects of Information Load and Cognitive Style
117
Table 4. Average Processing Time. Panel A: Mean values of average processing time Alternatives 6 Dimensions 6 Sensing Perception 12.9 Judgment 17.2 Intuition Perception 17.8 Judgment 17.1 Sensing means Intuition means Perception means Judgment means Overall means
15.1 17.5 15.3 17.2 16.2
6 12
12 6
12 12
15.0 20.1
13.6 17.8
14.3 20.6
19.2 19.1
17.5 17.3
20.3 16.3
17.6 19.2 17.1 19.6 18.3
15.7 17.4 15.5 17.5 16.5
17.5 18.3 17.3 18.4 17.9
Panel B: Repeated measures ANOVA Sources of Variation Between subjects Perception type (P) Dominant personality style (S ) PS Within subjects Alternatives (A) AP AS Dimensions (D) DP DS AD
F-ratio
p-value
2.24 2.93 8.11
0.1404 0.0945 0.0067
0.01 0.29 0.20 5.67 0.35 0.01 0.36
0.9165 0.5962 0.6484 0.0212 0.5538 0.9379 0.5565
between dominant personality style and perception type (F=8.11; p=0.0067), thus supporting H3A. Fig. 2 graphs the interaction between dominant personality style and perception type. Looking first at individuals with a dominant perception personality style (the upward sloping line), sensing types had a significantly lower average processing time compared to intuitive types (planned contrast F=9.45; p=0.0036). That difference reverses when one views the dominant judgment personality style individuals (the downward sloping line), which shows a slightly higher average processing time for sensing types compared to intuitive type. For sensing individuals, there was a significant lower average processing time depending on whether perception or judgment was the dominant personality style (planned contrast F=10.38; p=0.0024). For intuitive types, there was
118
CHARLES F. KELLIHER AND LOIS S. MAHONEY
20
Average Time
19 18 17 16 15 14
Perception
13
Judgment
12 Sensing
Intuition Perception Type
Fig. 2.
Average Processing Time: Dominant Personality Style Perception Type.
no significant difference regardless of the dominant personality style. Again, this suggests that some cognitive factors may influence behavior only when it is acting as the dominant personality style providing overall direction to the individual’s personality. Sensing types, who prefer to gather factual data and spend less time analyzing information, seem to arrive at an answer faster only if their dominant personality style is perception. Table 4 also reports a marginally significant dominant personality style (F=2.93; p=0.0945). However, when examining the cell means in Table 4, the differences between perception and judgment personality style only exist for sensing types, thus indicating that the interaction effect overrides the main effect and H3B is not supported. Table 4 also shows that H3C is not supported as perception type is not significant (F=2.24, p=0.1404). Overall, H3A is supported as there is a significant interaction effect between dominant perception personality style and perception type as sensing type individuals with perception dominant personality style had a lower average processing time compared to sensing type individuals with judgment dominant personality style. H3B and H3C are not supported. Hypotheses 4: Information Search Strategies Table 5 shows that H4A is supported, as there is a significant interaction between alternatives and dominant personality style (F=5.17; p=0.0276), and H4C is marginally supported as the interaction between alternatives and perception type (F=3.11; p=0.0843). However, H4B and H4D are not supported as the interaction between dimensions and dominant personality
Effects of Information Load and Cognitive Style
119
Table 5. Information Search Strategies. Panel A: Mean values for information search strategies Alternatives 6 6 Dimensions 6 12 Sensing Perception 0.65 0.76 Judgment 0.71 0.67 Intuition Perception 0.39 0.56 Judgment 0.75 0.62 Sensing means Intuition means Perception means Judgment means Overall means
0.68 0.57 0.52 0.73 0.63
0.72 0.59 0.66 0.64 0.65
12 6
12 12
0.69 0.54
0.91 0.66
0.75 0.75
0.80 0.70
0.62 0.75 0.72 0.64 0.68
0.79 0.75 0.85 0.68 0.77
Panel B: Repeated measures ANOVA Sources of Variation Between subjects Perception type (P) Dominant personality style (S ) PS Within subjects Alternatives (A) AP AS Dimensions (D) DP DS AD
F-ratio
p-value
0.16 0.02 1.11
0.6941 0.8926 0.2973
3.15 3.11 5.17 1.23 0.84 2.46 0.35
0.0829 0.0843 0.0276 0.2729 0.3645 0.1240 0.5533
style (F=2.46; p=0.1240) and dimensions and perception type (F=0.84; p=0.3645) are not significant. Fig. 3 graphs the interaction between alternatives and dominant personality style. The negative slope for individuals with a dominant perception personality style indicates that when faced with more alternatives, they changed their research strategies more and relied more heavily on searching within a particular dimension across alternatives. Again, since these decision makers searched more of the information, perhaps this more systematic strategy enabled them to process the information more efficiently. In contrast, there was little difference in the direction of search measure for decision makers with a dominant judgment personality style who selected less information overall
120
CHARLES F. KELLIHER AND LOIS S. MAHONEY
-0.50 Perception
Direction
-0.55
Judgment
-0.60 -0.65 -0.70 -0.75 -0.80
6
12 Alternatives
Fig. 3.
Search Strategy: Alternatives Dominant Personality Style.
-0.50 Sensing
-0.55
Intuition Direction
-0.60 -0.65 -0.70 -0.75 -0.80
6
12 Alternatives
Fig. 4.
Search Strategy: Alternatives Perception Type.
(the relatively flat line). Their decision-making behavior is not significantly affected by increases in the number of alternatives. Similarly, Fig. 4 plots the interaction between alternatives and perception type. There are differences in the direction of search between sensing and intuitive types in the smaller tasks containing only six alternatives. In these tasks, sensing types relied more on searching within a particular dimension across alternatives than did intuitive types. However, the direction of search measures for both perception types converged when there were twelve
Effects of Information Load and Cognitive Style
121
alternatives to evaluate. Overall, intuitive types changed their strategy from searching across dimensions and within an alternative to searching within a dimension across alternatives (as indicated by the negative slope), while the sensing type’s search behavior changed very little (the relatively flat line).
SUMMARY, CONTRIBUTIONS, LIMITATIONS, AND EXTENSIONS The findings suggest the importance of considering information load and cognitive characteristics on the information processing behaviors employed by decision makers. For a management accounting task, the findings indicate that increases in the supply of information and the cognitive characteristics of the decision maker affect information processing behavior. Specifically, the proportion of information searched is significantly related to dominant personality style while the variability and time to search interact with dominant personality style and perception type. Furthermore, information search strategies are affected by the interaction of the information load and perception type and information load and dominant personality style. The results show that decision makers with a dominant personality style of perception search a higher proportion of information than judgment dominant decision makers. This appears to be consistent with Myers’s personality profiles as sensing types employ reduced processing strategies to a lesser degree than intuitive types. Differences in the information processing behaviors of variability and time between sensing and intuitive types relate only to those individuals with a dominant perception personality style (i.e., who prefer to be in an information-gathering mode). The impact of these two personality preferences on information search behavior is not evident, nor predictable, when judgment is the dominant personality style.10 Dominant perception personality style decision makers, characterized as a sensing perception type, search less information in a shorter time in comparison to decision makers with an intuition perception type. Dominant personality style and perception type are also not independent of the task, as evidenced by a significant interaction between dominant personality style and the number of alternatives and marginally significant interaction between perception type and number of alternatives. As the information load increases by adding more alternatives, decision makers with a dominant perception personality style and intuition perception
122
CHARLES F. KELLIHER AND LOIS S. MAHONEY
types adjust their search more by relying more heavily on searching within a particular dimension across alternatives to process the information. This also appears to be consistent with Myers’s personality profiles. The above findings suggest that organizations may benefit from understanding the cognitive style of employees. With a growing number of companies turning to psychological testing for hiring and training purposes (for example Kamen, 1997), such categorizations are becoming increasingly feasible. The availability of information on employee cognitive styles could therefore assist organizations in assigning employees to manage tasks that involve varying amounts of information. The results have potential implications for accountants, both as providers of financial information and in their advisory role to management. The education and training of accountants is intended to ensure competence; however, education and training may not always ensure efficient and effective decision-making. The search process and subsequent judgments are affected by differences in the cognitive characteristics of accountants. Sensing/judgment types are the predominant personality types found in the accounting profession (Schloemer & Schloemer, 1997), and these personality types are well suited to many tasks performed by accountants (Myers & McCaulley, 1985). However, the expanding services of modern CPA firms suggest that individuals with a broader range of cognitive characteristics may be needed. Lack of diversity may result in failure to consider all aspects of a decision leading to poor decisions and missed opportunities (Kovar, Ott, & Fisher, 2003). While the explicit information search techniques used in the present study provide a very rich analysis of the underlying decision processes, several problems may limit the generalization of the findings. This method focuses on participants’ selection of ‘‘external’’ information and does not attempt to explain how the information interacts with information stored ‘‘internally’’ in memory. The obtrusive process necessary to collect the data (computer assisted information boards requiring participants to select serially the information to be evaluated) may alter search behavior. As Arch, Bettman, and Kakkar (1978) noted, ‘‘one cannot measure a process without impacting the process.’’ Additionally, this study did not examine the accuracy and consistency of decisions or the impact of time constraints on decisionmaking. Therefore, determining how judgment performance between cognitive abilities types compared is not possible but simply performance did differ according to cognitive ability. Hopefully this study will serve as a catalyst for future replications, extensions, and refinements that further the understanding of the underlying
Effects of Information Load and Cognitive Style
123
decision process. More work must be conducted using the explicit information search techniques with other populations and other decision tasks before valid generalizations can be made. Researchers also might consider the relationship, if any, between the underlying decision process and the following variables: (1) the form, content, and organization of the information set; (2) other levels and definitions of information load; (3) the similarity of the alternatives; (4) the proportion of discriminating cues in the information set; (5) the time allocated or available to make the decision; (6) other facets of human information processing theory; and (7) other personality constructs.11 Future descriptive judgment models of the underlying decision process should account for both the characteristics of the task environment and the decision maker. Additional research in this area should seek to increase the understanding of the decision making process and to communicate this knowledge to decision makers to improve their self-insight. Research aimed at conveying the importance of seemingly minor changes in task structure and individual biases should be undertaken to improve the content of training programs and assist in the development of decision aids.
NOTES 1. The MBTI has been used by many decision theory researchers to study decision patterns of individuals with their cognitive style (Cheng et al., 2003; Barkhi, 2002; Nutt, 1989; Rampasad & Mitorff, 1984; Robey & Taggart, 1981; Casey, 1980). Reliability and validity for the dichotomous preferences has been high (Wheeler, 2001), especially among college students (Myers & McCaulley, 1985). Numerous researchers have recognized their wide use and intuitive appeals as reason for continued use of this instrument in research (Kovar et al., 2003). 2. These two levels (6 and 12) were selected after an extensive review of the existing literature. Performance reports smaller than 6 6 were not examined because prior research reported that participants searched most of the information, which would eliminate differences in individual search behavior relevant to the present study. Performance reports larger than 12 12 were not considered to reduce participant fatigue while still maintaining differences in individual search behavior. 3. The MBTI Manual uses the following adjectives to describe the strength of reporting the preference: 1–9=slight; 10–19=moderate; 20–39=clear; and >39=very clear. 4. A written script of the verbal instructions was prepared in advance so that each participant received the same directions. The verbal instructions walked each participant through the entire process including the practice case. 5. The results of the exit questionnaire indicated that the participants understood the decision task and instructions, had no difficulty interacting with the
124
CHARLES F. KELLIHER AND LOIS S. MAHONEY
computer-assisted information board, and felt they were prepared to make this type of business decision. 6. The Center for Applications of Psychological Type (1985) lists well over 1,000 empirical studies that have used the MBTI in a wide variety of contexts. 7. Using the direction of search measure developed by Payne (1976), we computed the following ratio: number interdimensional shifts number intradimensional shifts number interdimensional shifts þ number intradimensional shifts Where a pattern of only interdimensional shifts would have a value of +1, a pattern consisting of only intradimensional shifts would have a value of 1. 8. The dependent variables were examined for normality using graphical methods, descriptive plots, and theoretical statistics. Some positive skewness was present in the data so the dependent variables were ‘‘normalized’’ using logarithm and square root transformations. Since the subsequent analyses were nearly identical to original tests using the raw data, the results from the original data are reported in the paper. This makes it easier to interpret the means reported in the paper. 9. The range of variability measures is from 0.0 to 0.50; the maximum variability is attained if one-half of the alternatives were searched on all of the dimensions and the other half of the alternatives were not searched at all. 10. This is evident by examining the number of significant interactions between the judgment factor and the perception factor. 11. For example motivation, attention and perception, information evaluation, use of memory, decision rules and processes, consumption and learning (Bettman, 1979).
REFERENCES American Institute of Certified Public Accountants. (2000). Vision project. New York: AICPA. Anderson, M. J. (1988). A comparative analysis of information search and evaluation behavior of professional and non-professional financial analysts. Accounting, Organizations and Society, 13(5), 431–446. Arch, D., Bettman, J., & Kakkar, P. (1978). Subjects’ information processing in information display board studies. In: K. Hunt (Ed.), Advances in consumer research (Vol. 5, pp. 555–560). Urbana, III: Association for Consumer Research. Barkhi, R. (2002). Cognitive style may mitigate the impact of communication mode. Information and Management, 39, 677–688. Bettman, J. (1979). An information processing theory of consumer choice. Reading, MA: Addison-Wesley. Biggs, S., Bedard, J., Gaber, B., & Linsmeier, T. (1985). The effects of task size and similarity on the decision behavior of bank loan officers. Management Science, 31(8), 970–987. Biggs, S., & Wild, J. (1985). An investigation of auditor judgment in analytical review. The Accounting Review, 60(4), 607–633. Bradley, J. H., & Hebert, F. J. (1997). The effect of personality type on team performance. Journal of Management, 16(5), 337–353.
Effects of Information Load and Cognitive Style
125
Casey, C. (1980). The usefulness of accounting ratios for subjects’ predictions of corporate failure: Replication and extensions. Journal of Accounting Research, 18(2), 603–613. Center for Applications of Psychological Type. (1985). Myers-Briggs type indicator: Bibliography. Gainesville, FL: CAPT. Cheng, M., Luckett, P., & Schultz, A. (2003). The effects of cognitive style diversity on decision making dyads: An empirical analysis in the context of a complex task. Behavioral Research in Accounting, 15, 39–61. Chenhall, R., & Morris, D. (1991). The effect of cognitive style and sponsorship bias of the treatment of opportunity costs in resource allocation decisions. Accounting, Organization and Society, 16, 27–46. Chewning, E. G., & Harrell, M. (1990). The effect of information load on decision makers’ cue utilization levels and decision quality in a financial distress decision task. Accounting, Organizations and Society, 15(6), 527–542. Cook, G. J. (1993). An empirical investigation of information search strategies with implications for DSS design. Decision Sciences, 24(3), 683–697. Cook, G. J., & Swain, M. R. (1993). A computerized approach to decision process tracing for decision support design. Decision Science, 24(5), 931–952. Ford, J. K., Schmitt, N., Schlechtman, S. L., Hults, B. M., & Doherty, M. L. (1989). Process tracing methods: Contributions, problems, and neglected research questions. Organizational Behavior and Human Decision Processes, 43, 75–117. Ho, J. L., & Rodgers, W. (1993). A review of accounting research on cognitive characteristics. Journal of Accounting Literature, 12, 101–130. Hollander, A. S., Denna, E. L., & Cherrington, J. O. (2000). Accounting, information technology, and business solutions (2nd ed.). New York, NY: McGraw-Hill Companies, Inc. Iselin, E. (1988). The effects of information load and information diversity on decision quality in a structured decision task. Accounting Organizations and Society, 13(2), 147–164. Jung, C. (1923). Psychological types, (Vol. 6). Princeton, NJ: Princeton University Press. Kamen, R. (1997). Psych selection. Journal of Business Strategy, 18(2), 22–27. Keppel, G. (1982). Design and analysis: A researcher’s handbook (2nd ed.). Englewood Cliffs, NJ: Prentice Hall, Inc. Kovar, S. E., Ott, R. L., & Fisher, D. G. (2003). Personality preferences of accounting students: A longitudinal case study. Journal of Accounting Education, 21, 75–94. Lawrence, G. (1993). People types and tiger stripes: A practical guide to learning styles. Gainesville, FL: Center for Applications of Psychological Type, Inc. McCaulley, M. (1978). Application of the Myers-Briggs type indicator to medicine and other health professions. Gainesville, FL: CAPT. Mock, T., & Vasarhelyi, M. (1984). Context, findings and method in cognitive style research: A comparative study. In: S. Moriarity & E. Joyce (Eds), Decision making and accounting: Current research. University of Oklahoma, Norman, OK: Centre for Economic and Management Research. Myers, I. (1962). The Myers-Briggs type indicator (the manual). Palo Alto, CA: Consulting Psychologists Press, Inc. Myers, I. (1980). Introduction to type. Palo Alto, CA: Consulting Psychologists Press, Inc. Myers, I. B., & McCaulley, M. H. (1985). Manual: A guide to the development and use of the Myers-Briggs indicator. Palo Alto, CA: Consulting Psychologists Press. Nutt, P. C. (1989). Uncertainty and culture in bank loan decisions. International Journal of Management Science, 17(3), 297–308.
126
CHARLES F. KELLIHER AND LOIS S. MAHONEY
Payne, J. (1976). Task complexity and contingent processing in decision making: An information search and protocol analysis. Organizational Behavior and Human Performance, 16(2), 366–387. Payne, J., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. New York: Cambridge Press. Rampasad, A., & Mitorff, I. (1984). On formulating strategic problems. Academy of Management Review, 9(4), 597–605. Robey, D., & Taggart, W. (1981). Measuring manager’s minds: The assessment of style in human information processing. Academy of Management Review, 6(3), 375–383. Schloemer, P. G., & Schloemer, M. S. (1997). The personality types and preferences of CPA firm professionals: An analysis of changes in the profession. Accounting Horizons, 11(4), 24–39. Shields, M. (1980). Some effects of information load on search patterns used to analyze performance reports. Accounting Organizations and Society, 5(4), 429–442. Shields, M. (1983). Effects of information supply and demand on judgment accuracy: Evidence from corporate managers. The Accounting Review, 58(2), 284–303. Shugan, S. (1980). The cost of thinking. Journal of Consumer Research, 7, 99–111. Stocks, M. H., & Harrell, A. (1995). The impact of an increase in accounting information level on the judgment quality of individuals and groups. Accounting, Organization and Society, 20(7/8), 685–700. Stocks, M. H., & Tuttle, B. (1998). An examination of information presentation effects on financial distress predictions. Advances in Accounting Information Systems, 6, 107–128. Swain, M. R., & Haka, S. F. (2000). Effects of information load on capital budgeting decisions. Behavioral Research in Accounting, 12, 171–198. Vassen, E. H. J., Baker, C. R., & Hayes, R. S. (1993). Cognitive styles of experienced auditors in the Netherlands. British Accounting Review, 25, 367–382. Wheeler, P. (2001). The Myers-Briggs type indicator and applications to accounting education and research. Issues in Accounting Education, 16(1), 125–150. Wolk, C., & Nikolai, L. A. (1997). Personality types of accounting students and faculty: Comparisons and implications. Journal of Accounting Education, 15(1), 1–17.
AN ASSESSMENT OF THE CONTRIBUTION OF STRESS AROUSAL TO THE BEYOND THE ROLE STRESS MODEL Kenneth J. Smith, Jeanette A. Davy and George S. Everly, Jr. ABSTRACT Two key constructs, burnout and stress arousal, have emerged in the behavioral accounting literature over the past decade as viable mediators in the link between organizational stressors and key job-related and personal outcomes. However, despite the apparent conceptual distinctiveness of these two constructs, no one investigation has simultaneously examined their unique contribution to the stressor-outcome dynamic. This study attempts to replicate and test an expanded version of Fogarty et al.’s (2000) Beyond the Role Stress Model. The extension of the referent model assesses the extent to which both stress arousal and burnout are associated with key organizational stressors and job outcomes among public accountants. The premise for this investigation is that while burnout appears to represent the consequence of prolonged exposure to one or more stressors, stress arousal represents an immediate response to those same stressors. Thus, from a temporal perspective, stress arousal should receive consideration as a viable antecedent to burnout in models that examine the Advances in Accounting Behavioral Research, Volume 10, 127–158 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10006-5
127
128
KENNETH J. SMITH ET AL.
stressor-outcome dynamic. With respect to the referent model, this paper proposes that stress arousal may be directly related to deleterious effects on job outcomes before burnout tendencies manifest themselves, and it may represent a direct influence on burnout as well as a mediating influence between sources of job stress and burnout. The results support the conceptual distinctiveness of the stress arousal and burnout constructs, as well as the proposition that stress arousal and subsequently burnout serve as important mediators of the relations between role stressors and the key outcomes of job satisfaction, performance, and turnover intentions.
INTRODUCTION Nearly a quarter-century ago, Weick (1983) proposed that stress represented a unifying construct for examining numerous issues relating to performance and individual well-being among accountants in the workplace. During the succeeding decade, numerous articles published in both the academic and professional literatures examined various relationships between purported stress measures and their antecedents and consequences among accountants in a variety of occupational settings. These studies focused heavily on organizational stressors (e.g., role conflict, role ambiguity, role overload) as direct antecedents to key job outcomes such as job satisfaction and performance (see, Collins & Killough, 1992; LePine, Podsakoff, & LePine, 2005; Smith & Everly, 1990 for reviews). Often, these studies produced mixed results (see Fogarty, Singh, Rhoads, & Moore, 2000, p. 36; LePine et al., 2005). Understanding and explaining these inconsistencies from both theoretical and practical perspectives is important. While stress has been a construct of central interest to behavioral accounting researchers, it will not be completely understood until its relationship with outcome variables such as performance is clarified. Practically speaking, organizations spend large amounts of resources attempting to manage stress. A better understanding of stress effects should be helpful in developing and implementing more useful stress management programs (LePine et al., 2005). Fogarty et al. (2000, p. 37) suggest that inconsistent findings in prior studies, which examined the unmediated effects of role stressors on job outcomes, may have been due to misspecification bias. This bias results from the omission of key variables that, in this case, might link role stressors and job outcomes. The introduction of key mediating variables that are related
Contribution of Stress Arousal to the Beyond the Role Stress Model
129
to both job stressors and job outcomes may reduce misspecification bias and enhance the explanatory power of stressors on outcome variables. Over the past decade, selected stressor-outcome studies in the accounting literature have expanded to incorporate key mediator variables in recognition of the aforementioned misspecification bias potential. Two key ‘‘stress’’ measures that have emerged as viable mediators in the link between organizational stressors and key job-related and personal outcomes, are burnout (e.g., see Fogarty et al., 2000) and stress arousal (e.g., see Smith, Davy, & Stewart, 1998). To date, no single investigation has simultaneously examined the contribution of these two constructs. Yet, by their definitions alone, they appear to be conceptually distinct. Stress arousal is defined as ‘‘a fairly predictable arousal of psychophysiological (mind-body) systems which, if prolonged, can fatigue or damage the system to the point of malfunction or disease’’ (Girdano & Everly, 1986, p. 5). The conceptual distinctiveness of stress arousal from organizational and other environmental stressors (e.g., work–home conflict) is important. Numerous studies in the organizational behavior literature document individual differences in susceptibility to stress, e.g., environmental factors, which cause excessive stress for one person but might have little or no impact on another. This is because a stressor must be perceived by an individual, and it must be perceived as threatening for stress arousal to occur (Lazarus & Folkman, 1984). Perception of a stressor as threatening evokes the stress process (LePine et al., 2005). The first step in that process is arousal (Smith et al., 1998). Moreover, the conceptual distinctiveness of stress arousal and key organizational stressors (e.g., role conflict, role ambiguity, and role overload) has been demonstrated along with the potential of each to significantly influence various job and personal outcomes in the public accounting work environment (Smith, Everly, & Johns, 1993; Smith, Davy, & Everly, 1995; Smith et al., 1998). Burnout in a job context is defined as a negative psychological response to work demands and/or interpersonal stressors (Almer & Kaplan, 2002; Cordes & Dougherty, 1993; Maslach, 1982). The burnout construct consists of three separate dimensions: emotional exhaustion, reduced personal accomplishment, and depersonalization. Emotional exhaustion is defined as a lack of energy and a feeling that one’s emotional resources are depleted (Cordes & Dougherty, 1993, p. 623). Reduced personal accomplishment is characterized by feelings of low self-esteem, low motivation, and the inability to perform satisfactorily. Depersonalization refers to an uncaring attitude toward others (e.g., clients, co-workers), and emotional detachment.
130
KENNETH J. SMITH ET AL.
Though both burnout and stress arousal are defined as a response to environmental stressors, burnout appears to represent the consequence of prolonged exposure to one or more of these stressors (Maslach & Schaufeli, 1993; LePine et al., 2005) that overwhelm individuals’ coping resources (Feldman & Weitz, 1988; Fogarty et al., 2000) resulting in dysfunctional effects on job outcomes. In addition, as Fogarty et al. (2000, p. 35) note, burnout is premised on the notion that specific role stressors may not be individually excessive, but their cumulative effect may be overwhelming. On the other hand, stress arousal represents an immediate response to those same stressors (Smith et al., 1998), and the physiological link between any given stressor and various outcomes (Everly & Lating, 2002; Selye, 1956). Smith, Davy, and Everly (2006) provide empirical evidence of the construct distinctiveness of stress arousal and burnout as measured in the present study. However, the relationship between stress arousal and burnout in the stressor-to-outcome dynamic has yet to be tested. It is therefore hypothesized that stress arousal is: (1) directly related to negative effects on job outcomes before burnout tendencies manifest themselves, and (2) has a direct influence on burnout as well as a mediating influence between sources of job stress and burnout. The purpose of this study is to replicate and test an expanded version of the Beyond the Role Stress Model proposed and tested by Fogarty et al. (2000).1 The referent study examined the relations among job stressors, burnout, and the job outcomes of satisfaction, performance, and turnover intentions among a sample of American Institute of Certified Public Accountants (AICPA) members in five states. This study will attempt to replicate their findings with a national sample of AICPA members employed in public accounting. It will then examine an expanded model that assesses the extent to which both stress arousal and burnout are associated, with key organizational stressors and job outcomes among public accountants. Specifically, this study will incorporate stress arousal into the referent model as a postulated mediating influence between job stressors and burnout, and as a direct influence on the aforementioned job outcomes. The balance of this paper is organized as follows: first, the theoretical and empirical underpinnings of the replication and expanded version of the referent model are described. Next, the methods employed to test the proposed models are discussed, followed by the results from the tests of the conceptual models along with a discussion of the significance of the results in terms of extending the Beyond the Role Stress Model. Finally, the study’s limitations and its importance to researchers and accounting organizations are discussed.
Contribution of Stress Arousal to the Beyond the Role Stress Model
131
EXTENDING THE BEYOND THE ROLE STRESS MODEL The Beyond the Role Stress Model proposed and tested by Fogarty et al. (2000) is used as the referent model, allowing replication of their findings. In order to expand on their results, stress arousal is included as a precursor to burnout (Lazarus & Folkman, 1984; Smith et al., 1995; Smith et al., 1998). Paths A through O (see Fig. 1) represent a direct test of the referent model. The mere existence of individual job stressors may not be particularly distressing. However, Fogarty et al. (2000) posit that the higher the level of conflict, ambiguity, and overload in the work environment, the more likely individuals will experience burnout symptoms (Paths A, B, C). In turn, burnout has been directly related to job outcomes, such as job satisfaction (Path D), turnover intentions (Path E), and performance (Path F). As does the referent study, we posit negative relationships between burnout and both job satisfaction and performance, and a positive relationship between burnout and turnover intentions. Finally, Fogarty et al. (2000) argue the direct, unmediated relationships between the three job stressors (role overload, conflict, and ambiguity) and job satisfaction (Paths G, H, I),
GB+
Role Overload A+
P+
H-
Job Satisfaction
T-
IW-
D-
J+ K+
Role Conflict
Q+
Stress Arousal
S+
Burnout
Turnover Intentions
E+
U+ R+ Role Ambiguity
FC+
V-
Performance O-
Fig. 1.
Expanded Beyond the Role Stress Model to be Tested.
L+
MNX+
132
KENNETH J. SMITH ET AL.
turnover intentions (Paths J, K, L) and performance (Paths M, N, O) would be very small or not significant, after controlling for the mediating influence of burnout. But there may be sufficient levels of stressors to still have some effect on job outcomes (Fogarty et al., 2000). Paths P through V represent the mediating effects of stress arousal between job stressors and burnout and between job stressors and job outcomes. Based on Lazarus and Folkman’s (1984) work, the three stressors (role overload, conflict, and ambiguity) are posited to be positively related to stress arousal (Paths P, Q, R). For stress arousal to occur, stressors must be perceived by an individual as threatening. As a result, these job stressors are predicted to be direct antecedents of stress arousal (Everly & Lating, 2002; Libby, 1983; Smith et al., 1995). Prolonged periods of stress arousal will produce consequences such as burnout. Again, burnout appears to represent the consequence of prolonged exposure to one or more stressors (Maslach & Schaufeli, 1993; LePine et al., 2005) that overwhelm individuals’ coping resources (Feldman & Weitz, 1988; Fogarty et al., 2000). The relationship between stressors and burnout is arguably mediated by stress arousal. If a stressor is not perceived as threatening, coping mechanisms will not be called on and burnout will not occur. As a result, Path S posits a positive relationship between stress arousal and burnout. Paths T, U, and V posit direct relationships between stress arousal and the outcomes, job satisfaction, turnover intentions, and performance. Negative relationships are posited between stress arousal and job satisfaction (T) and performance (V) and a positive relationship with turnover intentions (U). These paths are supported by previous research which argues that excessive stress can lead to dysfunctional outcomes (Libby, 1983; Michaels & Spector, 1982; Smith et al., 1995). Inclusion of these paths allow for direct tests of the mediating effect of stress arousal on the relationships between stressors and job related outcomes (Lazarus & Folkman, 1984; Smith et al., 1995) and the mediating effect of stress arousal on the relationships between stressors and burnout (Lazarus & Folkman, 1984) on these same outcomes. For proper specification of the model, Path W posits a negative relationship between job satisfaction and turnover intentions. This is consistent with prior research (Davy, Kinicki, & Scheck, 1991; Bullen & Flamholtz, 1985; Snead & Harrell, 1991). Finally, Path X posits a positive relationship between job satisfaction and performance. Research findings have been somewhat inconsistent in clarifying the relationship between these two constructs (Iaffaldono & Muchinsky, 1985). This study provides an opportunity to test this relationship in a somewhat different context than it has been tested before.
Contribution of Stress Arousal to the Beyond the Role Stress Model
133
METHODS Subjects and Procedure Subjects were selected from a mailing list provided by the American Institute of Certified Public Accountants (AICPA). A randomized sample of 2,500 members out of the approximately 91,333 institute members employed in public accounting were sent a research package containing the measures used in this study. One hundred sixty eight of the 2,500 instrument packages were returned as undeliverable. Five hundred sixty three usable responses (24%) were received within a pre-designated eight-week response period for inclusion in the study.2 Oppenheim’s (1966, p. 34) early-late hypothesis was used to assess non-response bias. Independent sample t-tests were conducted to assess the significance of mean score differences between the first 50 respondents and the final 50 respondents on each of the scales administered. No significant mean score differences between groups emerged from these analyses (at po.05), thus providing reasonable assurance that there was no significant non-response bias associated with the study. Of the 563 respondents, majority (58%) were male. Fifty nine percent (331) indicated that they were between 26 and 45 years old. Also, 80% (445) of those responding were married, and approximately 99% (557) reported the attainment of at least a bachelor’s degree level of education. Two hundred fifty three respondents (46%) indicated that they were partners, principals, or sole practitioners in accounting firms, and 294 (54%) indicated that they were staff members (juniors, seniors, or managers). Sixty-four percent (358) worked in offices with fewer than 100 professionals, and 53% of those responding (292) reported that they have worked at their present job for over seven years. Measures In order to facilitate replication of the Fogarty et al.’s (2000) Role Stress Model, each of the measures utilized in that study (pp. 41–42) are used as part of the current study. The role stressor measures were: 1. Role Ambiguity: three items from Rizzo, House, and Lirtzman’s (1970) 14-item Role Conflict and Role Ambiguity Scale. 2. Role Conflict: three items from Rizzo et al.’s (1970) 14-item Role Conflict and Role Ambiguity Scale. 3. Role Overload: five items from the Beehr, Welsh, and Tabor (1976) scale.
134
KENNETH J. SMITH ET AL.
The role stressor constructs were measured on five-point Likert scales. Fogarty et al. (2000, pp. 41–42) discuss the acceptability of these measures in terms of their psychometric properties as reported in prior research. The key outcome measures were: 1. Performance: a six-item scale drawn from Dubinski and Mattson (1979). 2. Turnover intentions: three items drawn from Donnelly and Ivancevich (1975). 3. Job Satisfaction: 27-items drawn from the Churchill, Ford, Hartley, and Walker (1985) scale. Each of the key outcome measures was measured using five-point Likert scales. Fogarty et al. (2000, p. 42) discuss the acceptability and conceptual desirability of these measures. With respect to job satisfaction, they note the multi-dimensional nature of the Churchill et al. (1985) scale and its selection over other measures due to ‘‘y its superior consideration of more specific aspects of jobs.’’ The final measure utilized from the referent study was the 24-item multidimensional role-specific (MROB) version of the Maslach Burnout Inventory (MBI) as developed by Singh, Goolsby, and Rhoads (1994). Fogarty et al. (2000, p. 41) note that: (1) this version of the scale spans three conceptual dimensions and four role members (co-workers, customers, immediate supervisor, and top management); and, (2) the favorable psychometric properties of this revised burnout measure. In accord with the referent study, the current study utilized a six-point Likert scale version of the MROB for each of the 24 scale items. Stress arousal was measured using 17 items drawn from the Stress Arousal Scale (SAS) developed by Everly, Sherman, and Smith (1989). The original SAS contains 20 items designed to tap the respondent’s cognitive-affective domain, i.e., the precipitators of the physiological stress response, thereby allowing an indirect assessment of one’s level of stress arousal.3 The conditions that define emotional arousal (as measured on the SAS) are highly correlated with stress-related physical symptoms (Lazarus & Folkman, 1984; Everly & Sobelman, 1987). The SAS has been utilized in a number of accounting research studies (e.g., see Smith et al., 2006; Smith et al., 1998). Factor analysis of the SAS on a large dataset (Smith et al., 1993) indicated the presence of two underlying dimensions, psychological discord (13-items, a=.91) and relaxation, (4-items, a=.88). The former was defined as ‘‘the state of emotional distress experienced as a result of cognitive interpretation of environmental events,’’ and the latter as ‘‘a state of cognitive-affective and psycho-physiological
Contribution of Stress Arousal to the Beyond the Role Stress Model
135
homeostasis, i.e., the lack of extraordinary arousal’’ (p. 138). Smith et al. (1998) replicated these results. The present study measured stress arousal using the 17-item version of the SAS. Item responses were made on 4-point Likert type scales ranging from ‘‘seldom or never’’ (1) to ‘‘almost always’’ (4). Table 1 presents the items comprising each dimension of the SAS and the MROB version of the MBI, respectively. This presentation is intended to illustrate the potential conceptual distinctiveness of the underlying dimensions tapped by the two scales, empirical support for which appears in the Results section below.4
Analysis The measures utilized in this study have demonstrated reliability and validity in prior research as noted above. However, there appeared to be a conceptual incongruity with respect to the job satisfaction scale. As noted above, Fogarty et al. (2000, p. 42) referred to the multi-dimensional nature of the job satisfaction scale, yet their reported results (49, 53) indicate that the scale items tapped a unidimensional construct. In order to better assess the dimensionality of this scale, a random holdout sample of 149 responses was generated from the full sample using a random sample generation procedure in Systat. Using this sub-sample, an iterated principal components factor analysis of the 27 items comprising the job satisfaction scale was conducted. Using a minimum eigenvalue greater than one and factor loadings of .5 or higher as criteria (Nunnally, 1978), four factors emerged from the job satisfaction scale. Table 2 reports the items loading on each factor. These factors were incorporated into the analysis of the final sample data as reported below. The stress arousal, burnout, and job satisfaction scales are multidimensional in nature, whereas the role stressor measures, as well as the performance and turnover intentions measures are unidimensional. Initially the items on the unidimensional scales were combined into two composite indicator variables for their respective constructs using the matched composites procedure described by Bentler and Wu (1995, pp. 201–202). This procedure is appropriate when there is no expectation that any of the composites created would be different from one another, and ‘‘each composite should measure the same construct, or combination of constructs, as measured by a single composite of all the original scores’’ (Bentler & Wu, 1995, p. 201). This procedure facilitates the development of a latent variable
136
KENNETH J. SMITH ET AL.
Table 1. Construct
Stress Arousal and Burnout Scale Items. Dimension – Items
Stress arousala
Within the last few weeks, how often have you found yourself y Psychological discord Repeating unpleasant thoughts? Anticipating or remembering unpleasant things? Feeling sad or depressed? Thinking about things which upset you? Upset? Irritable? Feeling frustrated? Preoccupied with recurring thoughts? Concerned or worried? Feeling tense? Having difficulty adjusting or just coping? Having difficulty relaxing? Annoyed? Relaxation Feeling relaxed? (R)b Feeling peaceful? (R) Feeling calm? (R) Feeling satisfied? (R)
Burnoutc
Please indicate the extent to which each statement accurately describes how you feel about your job y Depersonalization I feel I treat some clients as if they were impersonal ‘‘objects.’’ I feel indifferent toward some of my clients. I feel a lack of personal concern for my supervisor. I feel I’m becoming more hardened toward my supervisor. I feel I have become callous toward my coworkers. I feel insensitive toward my coworkers. I feel I am becoming less sympathetic toward top management. I feel alienated from top management. Reduced personal accomplishment I feel I perform effectively to meet the needs of my clients. (R) I feel effective in solving the problems of my clients. (R) I feel I am an important asset to my supervisor. (R) I feel my supervisor values my contribution to the firm. (R) I feel my coworkers truly value my assistance. (R) I feel I am a positive influence on my coworkers. (R) I feel I satisfy many of the demands set by top management. (R) I feel I make a positive contribution toward top management goals. (R) Emotional exhaustion Working with clients is really a strain for me. I feel I am working too hard for my clients.
Contribution of Stress Arousal to the Beyond the Role Stress Model
137
Table 1. (Continued ) Construct
Dimension – Items Working with my boss directly puts too much stress on me. I feel emotionally drained by the pressure my boss puts on me. I feel frustrated because of working directly with coworkers. I feel I work too hard trying to satisfy coworkers. I feel dismayed by the actions of top management. I feel burned out from trying to meet top management’s expectations.
a
4-point Likert scale ranging from 1=seldom or never to 4=almost always. (R)=reverse scored-item. c 6-point Likert scale ranging from 1=very much unlike me to 6=very much like me. b
model by providing multiple indicators for each construct (Anderson & Gerbing, 1988). This allows for a better estimate of the random error associated with the respective constructs. Random error is taken into account when estimating paths from constructs to indicator variables as well as within the structural model. Statistical under-identification problems related to the composite measures for the performance and role overload measures necessitated that the items on each of these scales be combined into three, rather than two, composite indicator variables.5 In order to further test for construct and discriminant validity among the constructs represented by the measures, confirmatory factor analysis was conducted on the remaining sample of 414 cases. In effect, this tested whether the factors would load on their respective underlying theoretical constructs (Anderson & Gerbing, 1988). Latent variables were constructed for each of the eight factors under review.6 The complete measurement model was tested using the elliptical estimation procedure in EQS Version 6.1 (Multivariate Software Inc., 2004). Anderson and Gerbing (1988) note that assessment of the measurement model must precede the test of structural linkages. EQS structural modeling analyses were then conducted. The first analysis tested and evaluated Paths A through O in Fig. 1 as a direct replication of the Fogarty et al. (2000) model. The second analysis tested and evaluated the complete theoretical model illustrated in Fig. 1. Then, in each analysis, statistically non-significant parameters were dropped based on the output of Wald tests (Bentler, 1995) applied to each model.7 Model fit was assessed using a variety of fit measures outlined by Bentler (1990). These include the goodness-of-fit chi-square, the Normed Fit Index (NFI), the Non-Normed Fit Index (NNFI), the Comparative Fit Index (CFI), the LISREL Goodness
138
KENNETH J. SMITH ET AL.
Table 2.
Factor Loadings for Job Satisfaction Survey Scale Items.
Construct 1. Recognition (m=3.685, aa=.954) How do you feel y With the amount of recognition and respect that I receive for my work? With the amount of respect that I receive for my work? With the extent to which I am recognized for my work? With the degree to which my work is perceived to be important by the company?
Loading
.703 .658 .647 .571
2. Family (m=4.166, a=.953) How do you feel y With the amount of consideration that my family gives me while on my job? With the support my family gives me? With the attitude of my family towards my job?
.898 .803
3. Boss (m=3.597, a=.907) How do you feel y With the way my boss helps me to achieve my immediate goals? With my boss’s ability to lead me and my immediate colleagues? With the considerate and sympathetic nature of my immediate boss? With the technical competence of my immediate boss?
.824 .793 .667 .630
4. Perks (m=3.828, a=.884) How do you feel y With the opportunity for acquiring higher skills? With the amount of compensation that I receive? With the extent to which I am fairly paid for what I contribute? With the opportunity in my job to achieve excellence in my work? With the chance for future promotion I have in my job? With the nature of the work that I do in my job?
.742 .720 .675 .673 .656 .606
.911
a
This is the Cronbach’s alpha reliability computed to index the internal consistency of the measure. Values exceeding .70 are considered satisfactory (Nunnally, 1978).
of Fit Index (GFI), the Root Mean Square Residual (RMR), the Average Off Diagonal Squared Residual (AOSR), and the Root Mean Square of Approximation (RMSEA). Numerous measures were used to assess model fit as no one measure is definitive (Fogarty et al., 2000, pp. 44–45). As a final evaluation of the Fig. 1 model, an a priori sequence of nested models was developed to compare and test against the reduced theoretical model. The first sequential model constrained the path from Stress Arousal to Satisfaction to 0. The second sequential model constrained the path from Stress Arousal to Performance to 0. Finally, the third sequential model
Contribution of Stress Arousal to the Beyond the Role Stress Model
139
constrained the path from Stress Arousal to Burnout to 0. This nested sequence of models provided direct tests of the hypotheses that Stress Arousal plays a significant mediating role in the relations between role stressors, burnout, and key job outcomes. Sequential chi-square difference tests (SCDT) were used to compare the nested structural models (Anderson & Gerbing, 1988). The SCDT test generates a statistic representing the chi-square difference between two alternative models relative to the difference in degrees of freedom between the two models. A significant chi-square difference value indicates a significant loss of fit by constraining a path to 0, and the path should be retained in the model (James, Mulaik, & Brett, 1982). A non-significant value indicates acceptance of the more parsimonious (i.e., more constrained) of the nested models.
RESULTS Descriptive Statistics Table 3 reports the correlations among the factors under study. Stress arousal correlates positively with the role stressors, burnout, and turnover intentions, and negatively with job satisfaction and performance. The role stressors correlate positively with stress arousal, burnout, and turnover intentions, and negatively with job satisfaction. Burnout correlates positively with turnover intentions and negatively with job satisfaction. Table 3. Factor
Factor Correlations.
Role Stressors RO
RA
Stress Burnout Arousal RC
Role overload (RO) 1.000 Role ambiguity (RA) .161 1.000 Role conflict (RC) .570 .430 1.000 Stress arousal .479 .288 .546 Burnout .417 .379 .557 Job satisfaction (JS) .203 .678 .565 Job performance (JP) .001 .046 .100 Turnover intent (TI) .400 .320 .400 Correlation is not significant at pr.05.
Outcomes JS
1.000 .578 .448 .209 .444
1.000 .537 .107 .569
JP
TI
1.000 .159 1.000 .615 .101 1.000
140
KENNETH J. SMITH ET AL.
These significant correlations suggest that stress arousal and burnout, as well as the other measures included in this study, are viable measures for evaluating the predicted relationships to be tested. Table 4 reports the reliability coefficients, mean scores, and standard deviations of each of the measurement constructs for the final sample. The reliability coefficients ranged from .761 (role conflict) to .957 (turnover intentions), all of which exceed the .700 minimum threshold suggested by Nunnally (1978) as sufficient to demonstrate the internal consistency of each measure.
Table 4. Measurement Characteristics for Study Constructs Description, Number of Items, Reliability (a), Means (m), and Standard Deviations (s). Measure (Indicators) Stress arousal dimensions Discord Relaxationb
Scale Description
Items (n)
aa
m
s
4-point Likert scale ranging from 1=seldom or never to 4=almost always
13 4
.935 .877
1.866 2.583
.578 .669
Burnout tendencies dimensions Emotional exhaustion 6-point Likert scale ranging Reduced personal from 1=very much unlike accomplishmentb me to 6=very much like me Depersonalization
8 8
.857 .852
2.366 2.052
.890 .625
8
.873
2.212
.883
Role stressors Role conflict Role ambiguity Role overload
3 3 5
.761 .763 .873
2.827 2.487 3.030
.841 .813 .917
4 4 6 3 6
.933 .887 .837 .923 .811
3.741 3.546 3.934 4.266 4.167
.962 1.030 .697 .770 .582
3
.957
2.298
1.260
5-point Likert scale ranging from 1=strongly disagree to 5=strongly agree
Job outcomes Job satisfaction dimensions Recognition 5-point Likert scale ranging from 1=strongly disagree to Boss 5=strongly agree Perks Family Job performance 5-point Likert scale ranging from 1=bottom 10% to 6=top 10% Turnover intentions 5-point Likert scale ranging from 1=strongly disagree to 5=strongly agree a
Cronbach’s alpha reliability coefficient. Values exceeding .70 are considered satisfactory (Nunnally, 1978). b The items on this dimension are reverse-scored.
Contribution of Stress Arousal to the Beyond the Role Stress Model
141
Measurement Model Tests Table 5 reports the results from the measurement model tests. As indicated in Panel A, all of the path coefficients from latent constructs to their manifest indicators were significant at po.05. The fit indices reported in Panel B indicate good model fit, as their values are each above the minimum threshold of .900. In addition, the AORS value of .043, the RMR value of .054, and the RMSEA of .053 with its tight 95% confidence interval (.046–.061) fall within their respective standards for acceptance. The nested measurement model comparison reported in Panel C supports Smith et al.’s (2006) finding that stress arousal and burnout are conceptually distinct constructs. As indicated, the model which constrained stress arousal and burnout to load on one underlying factor demonstrated a significant loss of fit in comparison to the reduced theoretical model, with a w2diff of 146.657 (df=6). Replication of the Fogarty et al. (2000) Model Table 6 provides goodness of fit statistics for the test of the Fogarty et al. (2000) model, i.e., stress arousal is not included. All of the fit indices reported in Panel A exceed the minimum threshold of .90. In addition, all of the residual analysis indicators are within their respective standards for acceptance. Panel B presents a comparison of the path coefficients between the replication and the original models. While the fit for the replication model is quite good, there are several differences between the two models. Eight of the fourteen paths are not consistent with respect to significance across the two models. The paths from role conflict to job satisfaction, performance and turnover intentions are significant in the replication model but not the original model. The same is true for role ambiguity to turnover intentions. In turn, the paths from role ambiguity and role overload to performance and the paths from burnout to job satisfaction and performance are not significant in the replication, but are significant in the Fogarty et al. (2000) model.
Theoretical Model Tests Inclusion of both the stress arousal and burnout constructs will arguably provide a more complete understanding of the stress process. The theoretical
142
KENNETH J. SMITH ET AL.
Table 5.
Results of Measurement Model Tests.
Panel A: Standardized measurement coefficients for the construct indicators
Role stressors Role conflict – RC1 Role conflict – RC2 Role ambiguity – RA1 Role ambiguity – RA2 Role overload – RO1 Role overload – RO2 Role overload – RO3 Stress arousal Discord Relaxation Burnout Burnout – emotional exhaustion Burnout – reduced personal achievement Burnout – depersonalization Outcomes Job satisfaction Recognition Boss Perks Family Turnover intentions – TI1 Turnover intentions – TI12 Performance – JP1 Performance – JP2 Performance – JP3 Panel B: Goodness of fit summaryc,
Standardized Coefficient
t-valuea
.751 .709 .867 .806 .884 .857 .866
–b 11.038 –b 13.336 –b 22.265 22.518
.794 .759
–b 11.430
.870 .738 .812
18.223 15.569 –b
.816 .562 .703 .362 .972 .984 .848 .877 .751
–b 10.760 13.568 6.781 –b 43.321 –b 18.312 16.432
d
Result
Statistical tests w2 df p-value w2/df Fit indices NFI NNFI CFI GFI
361 167 .00 2.16 .928 .949 .960 .924
Standard for Acceptance
NA NA >.05 o2.00 >.900 >.900 >.900 >.900
Contribution of Stress Arousal to the Beyond the Role Stress Model
143
Table 5. (Continued ) Panel B: Goodness of fit summaryc,
d
Residual analysis RMR AOSR RMSEA 95% confidence level
Result
Standard for Acceptance
.054 .043 .053 (.046–.061)
o.05 o.05 o.10 NA
Panel C: Nested measurement model comparison Model 1. Measurement model 2. Stress and burnout constrained to load on one underlying factor
w2
df
w2/df
w2diff
361.235 507.892
167 173
2.163 2.936
146.657
Notes: NFI=normed fit index. Higher values indicate better fit; NNFI=non-normed fit index. Higher values indicate better fit; CFI=comparative fit index. Higher values indicate better fit; GFI=goodness of fit index. Higher values indicate better fit; AOSR=average off diagonal squared residual. Lower values indicate better fit; RMR=root mean square residual. Lower values indicate better fit; RMSEA=root mean squared error of approximation. Lower values indicate better fit. a Each of the reported t-values is significant @ po.05. b Structural equations modeling procedures require that one measure of each construct be fixed to 1.0 to establish the scale of the latent construct. c The measurement model reflects the release of six factor covariances as determined by examination of the multivariate Wald test output from the test of the full model. The dropped covariances were: (1) satisfaction – performance; (2) burnout – performance; (3) overload – performance; (4) ambiguity – performance; (5) turnover – performance; (6) conflict – performance. By dropping these covariances, the degrees of freedom increased from 271 for the full model to 277 for the reduced model. d The Wald test is a post-hoc procedure that capitalizes on a particular sample, i.e., it is not theory-driven. Replication with another sample is needed to determine whether the relations reported herein hold. pr.001
model developed above does just that. Table 7 provides goodness-of-fit statistics for the tests of the theoretical model and the sequence of nested structural models. The fit indices reported in Panel A indicate a good model fit, as their values are each above the minimum threshold of .900.
144
KENNETH J. SMITH ET AL.
Table 6.
Results of Replication Tests.
Panel A: Goodness of fit summary for replication of Fogarty et al. (2000) model
Statistical tests w2 df p-value Chi-squarew2/df Fit indices NFI NNFI CFI GFI Residual analysis RMR AOSR RMSEA 95% confidence level Explained variance of dependent variables R2 for burnout R2 for job satisfaction R2 for job performance R2 for turnover intentions
Result
Standard for Acceptance
268 122 .00 2.20
NA NA >.05 o2.00
.938 .956 .965 .932
>.900 >.900 >.900 >.900
.052 .044 .054 (.045–.063)
o.05 o.05 o.10 NA
.368 .681 .013 .413
Panel B: Estimated path coefficients for the replication versus Fogarty et al. (2000) model a Hypothesized Relationship
Replication
Fogarty et al. (2000)
Dependent variable
Standardized coefficient
t-value
Standardized coefficient
t-value
Role conflict Role ambiguity Role overload
Job satisfaction Job satisfaction Job satisfaction
.182 .561*** .482
3.751 4.837 2.175
NS .407 .170
NS .3.882 1.994
Role conflict
Turnover intentions Turnover intentions Turnover intentions
.250
3.628
NS
NS
.172
3.251
NS
NS
NS
NS
NS
1.907 NS
NS .187
NS 1.492
Independent variable
Role ambiguity Role overload Role conflict Role ambiguity
Performance Performance
NS .115 NS
Contribution of Stress Arousal to the Beyond the Role Stress Model
145
Table 6. (Continued ) Panel B: Estimated path coefficients for the replication versus Fogarty et al. (2000) model a Hypothesized Relationship Independent variable Role overload
Dependent variable Performance
Replication
Fogarty et al. (2000)
Standardized coefficient
t-value
NS
NS
.258
2.262
4.426 2.819 2.076
.276 .320 .261
2.254 2.883 2.619
NS NS 5.930
.634 .754 .440
4.114 4.071 2.476
Role conflict Role ambiguity Role overload
Burnout Burnout Burnout
.415 .179 .145
Burnout Burnout Burnout
Job satisfaction Performance Turnover intentions
NS NS .362
Standardized coefficient
t-value
Note: NS=non-significant parameter. a Italicized text and coefficients represent relationships opposite in direction to those predicted by the theoretical model. =po.10. =po.05. =po.01.
In addition, the AOSRl value of .037, the RMR of .049, and the RMSEA of .052 fall within their respective standards for acceptance. At this point, based on the Wald test results, non-significant paths were dropped from the theoretical model. No significant loss of fit occurred. The theoretical model explained more variance for burnout, job performance and turnover intentions than did the replication of the Fogarty model. The theoretical model explained less variance for job satisfaction than that explained in the replication model. Panel B shows the results for the comparison of the nested sequence of models (Anderson & Gerbing, 1988). The model constraining the path from stress arousal to satisfaction to 0 demonstrated a significant loss of fit in comparison to the reduced theoretical model, with a w2diff of 4.035 (df=1). The next model constrained the path from stress arousal to performance to 0. The w2diff test again indicates a significant loss of fit (w2diff ¼ 15:818; df=1). The final model constrained the path from stress arousal to burnout to 0, also resulting in a significant w2diff of 22.364; df=1. These differences indicate that the paths from stress arousal to satisfaction, performance and burnout must remain in the model.
146
KENNETH J. SMITH ET AL.
Table 7.
Results of Theoretical Model Tests.
Panel A: Goodness of fit summary Result
Standard for Acceptance
Statistical tests 357 w2 df 171 p-value .00 w2/df 2.09 Fit indices NFI .929 NNFI .952 CFI .961 GFI .925 Residual analysis RM .049 AOSR .037 RMSEA .052 95% confidence level (.044–.059) Explained variance of dependent variables R2 for stress arousal .359 .442 R2 for burnout R2 for job satisfaction .605 R2 for job performance .062 .466 R2 for turnover intentions
NA NA >.05 o2.0 >.900 >.900 >.900 >.900 o.05 o.05 o.10 NA
Panel B: Nested model comparisons Model 1. 2. 3. 4.
Final theoretical modela Path from stress to satisfaction constrained to 0 Path from stress to performance constrained to 0 Path from stress to burnout constrained to 0
w2
df
w2/df
w2diff
357.724 361.759 373.542 380.088
171 172 172 172
2.092 2.103 2.172 2.210
4.035 15.818 22.364
Notes: NFI=normed fit index. Higher values indicate better fit; NNFI=non-normed fit index. Higher values indicate better fit; CFI=comparative fit index. Higher values indicate better fit; GFI=goodness of fit index. Higher values indicate better fit; AOSR=average off diagonal squared residual. Lower values indicate better fit; RMR=root mean square residual. Lower values indicate better fit; RMSEA=root mean squared error of approximately lower values indicate better fit. a The final theoretical model reflects the release of non-significant parameter estimates as determined by examination of the multivariate Wald test output from the test of the model containing the initial hypothesized paths. pr.05. pr.001.
Contribution of Stress Arousal to the Beyond the Role Stress Model
Table 8.
147
Structural Equations Results and Estimated Coefficients for the Hypothesized Modela,b.
Hypothesized Relationship Independent variable
Dependent variable
Role conflict Role ambiguity Role overload Role conflict Role ambiguity Role overload Stress arousal Stress arousal Stress arousal Stress arousal Burnout Burnout Burnout Role conflict Role conflict Role conflict Role ambiguity Role ambiguity Role ambiguity Role overload Role overload Role overload Job satisfaction Job satisfaction
Stress arousal Stress arousal Stress arousal Burnout Burnout Burnout Burnout Job satisfaction Performance Turnover intentions Job satisfaction Performance Turnover intentions Job satisfaction Performance Turnover intentions Job satisfaction Performance Turnover intentions Job satisfaction Performance Turnover intentions Performance Turnover intentions
Standard Coefficient
t-value
Probability
.437 .083 .239 .306 .140 .083 .372 .153 .285 .047 .190 .031 .293 .290 .021 .028 .461 .104 .039 .194 .140 .115 .165 .431
4.912 1.253 3.205 3.620 2.385 1.253 4.976 2.021 3.903 .699 2.710 .360 4.995 2.880 .182 .317 7.160 1.152 .565 2.924 2.097 2.519 1.469 7.403
po.01 NS po.01 po.01 po.05 NS po.01 po.05 po.01 NS po.01 NS po.01 po.01 NS NS po.01 NS NS po.01 po.05 po.05 NS po.01
Note: NS=non-significant parameter. a The italicized lines of information represent statistically significant paths. b Bold, italicized paths represent relationships opposite in direction to those predicted by the theoretical model.
Table 8 presents the estimated maximum likelihood structural coefficients and significance test results for each of the hypothesized paths. Fifteen of the 24 paths are significant. The significant paths are illustrated in Fig. 2. As predicted, role conflict and role overload are positively related to stress arousal with estimated coefficients of .437 and .239, respectively. In turn, stress arousal is positively related to burnout (.372), and negatively related to job satisfaction (.153) and performance (.285). As posited, role conflict and role ambiguity had significant positive relations with burnout, with estimated coefficients of .306 and .140,
148
KENNETH J. SMITH ET AL.
.194*** Role Overload
-.290*** Job Satisfaction -.461***
-.153** .306***
.422*
.239***
Role Conflict
.437***
Stress Arousal
.372***
Burnout
.293***
.115**
Turnover Intentions
-.285***
.267* Role Ambiguity
-.431***
-.190***
.131*
.140** .140**
Performance
1
Paths between each latent construct and its indicators are omitted for ease of diagramming and interpretability (see Table 4 for these relations). 2 For ease of reading, one line is drawn from each of the role stressors to burnout and job outcomes. The main effects for each stressor branch off from their respective lines. 3 Italicized path coefficients represent relationships opposite to those predicted by the theoretical model (i.e., overload to satisfaction, and overload to performance). * covariance between independent factors (all significant @ p<.05); **significant @ p<.01; ***significant @p<.001
Fig. 2.
Expanded Beyond the Role Stress Model Standardized Path Coefficients.1–3
respectively. Burnout, in turn, had a direct negative relation to job satisfaction (.190) and a direct positive relation to turnover intentions (.293). Also as predicted, each of the role stressors had a significant relation with job satisfaction with standardized coefficients of .290, .461, and .194, for role conflict, role ambiguity, and role overload, respectively. The positive relation between role overload and job satisfaction, while in the opposite direction as predicted, is consistent with Fogarty et al.’s (2000) findings. They also report a positive relationship between these two constructs. Only role overload had a significant relationship with turnover intentions (.115) and performance (.140), the latter being in the opposite direction as predicted. Again, the positive relationship between role overload and performance is consistent with the Fogarty et al.’s (2000) results. The counterintuitive positive relations between role overload and job satisfaction and between role overload and performance may result from overload being perceived as a challenge rather than a threat, and individuals rose to that challenge (LePine et al., 2005).
Contribution of Stress Arousal to the Beyond the Role Stress Model
149
Finally, contrary to expectations, job satisfaction was not significantly related to performance. This finding is not wholly unexpected. Research has historically failed to clarify the relationship between job satisfaction and performance (Iaffaldono & Muchinsky, 1985). However, as predicted, job satisfaction had a significant negative influence on turnover intentions (.431).
DISCUSSION This study’s results support the propositions that: (1) stress arousal and burnout are conceptually distinct constructs; and, (2) both stress arousal and burnout serve as important mediators of the relations between role stressors and the key outcomes of job satisfaction, performance, and turnover intentions. Replication The replication of the Fogarty model supported 6 of the 15 paths posited in the original model. Fogarty and his colleagues only supported 10 of the 15 paths. The relationships between the three role stressors and burnout are consistent across both studies. The only significant relationship between burnout and the outcome variables in the replication was with turnover intentions, while burnout was related to all the three outcome variables in Fogarty’s analysis. On the other hand, the replication supported paths between role conflict and job satisfaction, turnover intentions and performance. Role ambiguity was also related to turnover intentions in the replication. None of these relations were supported by Fogarty et al. (2000). These differences may be explained by analytical differences between the two studies. A major difference lies in how the multiple indicators for job satisfaction were determined. Rather than randomly split this multidimensional scale into two indicators as Fogarty and his colleagues reported, an exploratory factor analysis was conducted using an independent sample and Nunnally’s (1978) criteria for factor identification. The results show four clean factors embedded in the scale. These four factors were used as indicators for the latent construct, job satisfaction. This alone would reduce the biasing effects of random measurement error by (1) increasing the number of indicators for the construct which allows for better estimates of error variance; (2) grouping the items based on commonality; and (3) including only the items that were clearly related to/components of job satisfaction (only 17 of the original questions were used). In addition, the role overload and performance measures were divided into three composite
150
KENNETH J. SMITH ET AL.
indicators for their respective constructs in order to overcome underidentification problems. This again increased the number of indicator variables and better allowed for the estimation of error variance. Random measurement error effects can be to ‘‘attenuate estimates of coefficients, make the estimate of zero coefficients nonzero or yield coefficients with the wrong sign’’ (Williams & Hazer, 1986, p. 221). By reducing the amount of random measurement error (as with cleaner, more well defined job satisfaction factors) and by being better able to estimate random error effects using a greater number of indicators for several constructs (Anderson & Gerbing, 1988), it is not surprising that there are differences between the original results and the results of this study. Another factor that may help account for these differences is sample size. The sample in this study is more than twice as large as that used by Fogarty and his colleagues. Larger sample sizes provide more trustworthy z-tests on parameter significance (Bentler & Wu, 1995, p. 6). Combined, these arguments suggest the replication results provide a better test of the model presented by Fogarty et al. (2000). Theoretical Model As reported above, the role stressors had significant direct relationships with both stress arousal and burnout, with two exceptions. Significant relationships were not found between role overload and burnout or between role ambiguity and stress arousal. While not having a direct influence, role overload did impact burnout indirectly through stress arousal (stress arousal mediated the relationship between role overload and burnout). The findings for role ambiguity provide some support for Schaubroeck, Cotton, and Jennings (1989) proposition that the significant correlation of role stressors with one another may attenuate an otherwise significant finding. Role overload and role conflict were correlated at .57, while role conflict and role ambiguity were correlated at .43. As a result of these rather strong correlations, a significant relationship between role ambiguity and stress arousal may have been attenuated. There is support for this argument in the findings reported by Smith et al. (1995). Using the same measures, they reported correlations of .36 for role ambiguity and role conflict and .31 for role conflict and role overload. In that study, role ambiguity had a significant effect on stress arousal, but the relationship between role conflict and stress arousal was not significant. Combined, the results of these two studies and those of Schaubroeck and his colleagues support the argument that estimates of the effects of role conflict and role ambiguity are affected by the
Contribution of Stress Arousal to the Beyond the Role Stress Model
151
covariance between them (Schaubroeck et al., 1989). With respect to the direct influence of role stressors on job outcomes after controlling for stress arousal and burnout, the results were mixed. As noted above, Fogarty et al. (2000, pp. 39–40) predicted that after controlling for burnout, the direct effects of role stressors on the key job outcomes would be marginal or nonsignificant. However, they reported (p. 51) significant relations between role ambiguity and job satisfaction (), and between role overload and both job satisfaction (+) and performance (+). These findings were replicated in the theoretical model. These direct effects support the argument that ‘‘direct effects between role stressors and job outcomes may persist when role stressors are not of sufficient magnitude to foster burnout’’ (Fogarty et al., 2000, p. 39). The positive direct relations between role overload and job satisfaction and performance may be the result of individuals evaluating overload as a challenge rather than a threat. Challenge stressors are viewed as having the potential to promote personal gain and growth (LePine et al., 2005). Fogarty et al. (2000) refer to this as the ‘‘eustress’’ component (p. 61) of role overload. In this study, this component was larger for satisfaction (.194) and performance (.140) than the distress component. It is only when overload is seen as a threat that it will trigger stress arousal and, if it persists, burnout. The negative indirect effects, through stress arousal, represent the distress component of overload (p. 60) and, in this study, is quite small for satisfaction (.037) and performance (.068). These results suggest that managers/supervisors may want to strike a balance, keeping overload reasonably high to maintain its motivating effects while minimizing the dysfunctional effects (Fogarty et al., 2000; LePine et al., 2005). Conceptually we have proposed that the effect of multiple role stressors is best addressed initially through stress arousal. For, as Lazarus and Folkman (1984) proposed, a stressor must be perceived by an individual as threatening for stress arousal to occur. Stress arousal, in turn, has been shown to lead to a number of deleterious personal and organizational consequences. These results lend support to this conceptualization. First, two of the three role stressors had a significant positive influence on stress arousal as reported above. Furthermore, stress arousal was the most significant direct positive influence on burnout and a significant mediator between conflict and overload on burnout. For example, the total effects of role conflict on burnout equals the direct effect between the two plus the product of the indirect effects (role conflict to stress arousal and stress arousal to burnout). The total effects=.469 while the indirect effects alone=.163 (path coefficients from Table 8 were used to make these estimates). Thus the indirect or mediating effects through stress arousal account for 35% of
152
KENNETH J. SMITH ET AL.
the total effects (.163/.469). The same process applied to the total effects of role conflict on job satisfaction indicates the indirect or mediating effects of stress arousal and burnout total approximately 10% of the total effects. Both results support the importance of including both stress arousal and burnout when attempting to understand the stress process. Finally, stress arousal had a significant direct relation to satisfaction and performance, and an indirect relation to turnover intentions through burnout. Combined, these results indicate stress arousal that makes an important contribution in understanding the relationships between stressors and outcomes. This study’s findings with respect to burnout are generally supportive of the replication of Fogarty et al. (2000) presented above. The direct influence of role conflict and role ambiguity (+) on burnout and the direct influence of burnout on turnover intentions (+) concur with those of the replication findings. The non-significant relation between burnout and performance was replicated. Role conflict (), ambiguity (), and overload (+) have significant relationships with job satisfaction in both the replication and theoretical model results.
LIMITATIONS Like all research using a cross-sectional database and self-report measures, this study has certain potential limitations. Self-report measure utilization might have yielded results that were questionable due to poor instrument design, or subjected the tested relationships to the influence of common methods variance. The instruments utilized were shown to be reliable and valid in prior research, measures were selected to minimize item overlap, and the confirmatory factor analysis supported the theoretical meaningfulness of the latent constructs uncovered in the data reduction (i.e., exploratory factor analytic) phase of the study (see, Bollen, 1989, pp. 226–232 for a description of the relationship between exploratory and confirmatory factor analytic techniques). Using confirmatory factor analysis, it was also possible to test for common method bias. The results indicate that the common method of data collection across variables did not explain the correlations and covariances among the variables. The cross-sectional nature of the data raises the concern that ‘‘states’’ rather than ‘‘traits’’ were measured, thus raising the specter that the data may have also supported other network orderings. Therefore, despite the fact that the proposed model was based on reasonably sound a priori theory and research, the usual caveats for drawing causality inferences apply.
Contribution of Stress Arousal to the Beyond the Role Stress Model
153
Another possible concern is the measures used for performance. These measures were not specifically designed to measure performance by accountants. These measures were used in this study in order to replicate Fogarty et al. (2000). As discussed above, there are positive relations between these measures and satisfaction and performance. While seemingly counterintuitive, these results do have some theoretical support. A closer look at the individual items used suggests that three of the six items may not apply to accountants. These items ask respondents to rate themselves in terms of: (1) Your performance in regard to customer relations; (2) Your performance in regard to management of time, planning ability, and management of expenses; and, (3) Your performance in regard to knowledge of your products, company, competitors’ products, and customer needs. As indicated, the second item asks respondents to assess their performance in three areas. It is intuitive that accountants need both time management and planning skills to complete their work in a timely manner. Expense management is also critically important in an increasingly competitive market for accounting and auditing services. However, not all respondents may be directly responsible for expense management. The third item also asks respondents to assess their performance with regard to several items. While the term ‘‘products’’ is not necessarily the one most typically applied in a service setting, it is in the vernacular of the accounting and financial services professions, particularly with regard to service offerings. The issue is thus whether the respondents able to apply these questions to their jobs in a meaningful way and answer in a consistent manner. When reviewing the means, standard deviations, and Cronbach alpha statistic for the six questions making up the performance scale, the means were quite consistent, ranging from 3.977 to 4.367 across the items. In addition, the standard deviations are also well within reason ranging from .784 to .835. Alpha for the scale is .811. Dropping any of the items would reduce the total alpha. These results indicate the respondents answered consistently. They also support that respondents were able to apply these questions to their jobs in a meaningful way. However, the development of performance measures specifically tailored to the public accounting work environment would be a worthy endeavor. A final concern relates to the self-containment of the model’s explicit structural equations (see Kemery, Bedeian, Mossholder, & Touliatos, 1985, p. 373). When all relevant dependent (i.e., endogenous) variable determinants are measured, a structural model is considered to be self-contained (James et al., 1982). Practical constraints precluded us (as well as others) from including all known and/or suspected determinants of the constructs
154
KENNETH J. SMITH ET AL.
examined in the present analyses. To the extent that unexplained variance in the model is due to correlated omitted variables (i.e., non-random influences), the consequence of these omissions is biased estimates of the structural parameters relating the stressor constructs and each of the dependent variables under investigation. For example, observed effects of role conflict and role ambiguity have been reported to depend on common antecedent variables (Schaubroeck et al., 1989). The current study does not include possible antecedents of these role stressors. Identification and measurement of additional influences on stress arousal and burnout and their consequences (e.g., organizational and professional commitment) may also add to the refinement of the present model and thus should be investigated.
CONCLUSION As stated at the outset, for over two decades researchers have examined the antecedents and consequences of purported ‘‘stress’’ measures among accountants in a variety of occupational settings. Early studies focused on the stressor-to-outcome linkage with little regard to the mediating processes through which individuals react to sources of stress (Eden, 1990). Later studies, while recognizing the need to examine potential mediating influences in the stressor-to-outcome dynamic, appear to have taken divergent paths; one group of studies has examined the role of burnout in this process, while another stream of research has examined the role of stress arousal. While these studies have shown each construct to play a significant role in the stressor-to-outcome dynamic, they have not considered the potential explanatory power enhancement of specifications that further expand the traditional role stress model to include the simultaneous influence of these two stress constructs. The conceptual and empirical distinction of the stress arousal and burnout constructs strongly supported the present study effort to further refine the Beyond the Role Stress Model. These results support the admonition by Fogarty et al. (2000, p. 62) that ‘‘academic accountants who seek to describe the important behavioral dimensions of stress in the workplace should not ignore the role played by burnout.’’ Nor should the role of stress arousal be ignored. Each construct significantly contributes to the stress dynamic and deserves consideration in future efforts that attempt to further refine and enhance the traditional role stress model. This study’s findings also have practical implications for accounting organizations and individuals whose primary interest lies in reducing the
Contribution of Stress Arousal to the Beyond the Role Stress Model
155
cost of excessive stress in the workplace. Clinicians have advised that greatest prospects for success in mitigating the deleterious consequences of excessive stress are achieved from intervening as early in the causal process as possible (Beck, Rush, & Shaw, 1979; Bryant, Sacksville, Dang, Moulds, & Guthrie, 1999). Ideally, to operationalize this prescription all stress interventions should be targeting the occupational milieu. However, there are some aspects of almost all jobs that are inherently stressful. Therefore, the best way to build stress resistance and to foster resiliency may be to target the cognitive affective domain, more specifically, the cognitive affective indicators of acute stress arousal. For, cognitive affective arousal has long been recognized as a precursor to the physiological stress response (Selye, 1956; Lazarus & Folkman, 1984; Bandura, 1997; see also Everly & Lating, 2002 for a review). The present study lends support to this interventional tact and should be considered when designing stress management programs for accountants and especially in the training of accounting managers. Indeed, if interventions targeted at the stage of arousal can be implemented, there is a greater likelihood of mitigating undesirable consequences such as burnout, job dissatisfaction, lower performance, and increased turnover.
NOTES 1. Stout and Rebele (1996, pp. 5–7) argue favorably for studies that replicate and/ or extend prior research. 2. The response rate was comparable with that cited for other national studies using trade association membership and with other studies that used AICPA membership lists (Fogarty et al., 2000, p. 43.) 3. The three excluded items asked respondents, ‘‘Within the last few weeks, how often have you found yourself: (1) under a great deal of pressure; (2) pushed close to your limit; and, (3) feeling very tired or run down.’’ These items were excluded from consideration at the outset of the study as they were not found to load significantly on either the psychological discord or relaxation factors in the Smith et al. (1993, 1998) studies, and they did not significantly contribute to the internal consistency reliability or construct validity of the scale. 4. The four items comprising the relaxation dimension of the Stress Arousal Scale and the eight items comprising the reduced personal accomplishment dimension of the Burnout Scale are reverse-scored. Item reversal is a commonly used psychometric construction technique used to guard against response patterning (see Millon, 1987). Though not reported, the reverse-scored items on each scale correlated significantly with the other dimensions on their respective scales. Our use of these scales and their scoring also reflects their use in prior research. Everly et al. (1989) and Leiter and
156
KENNETH J. SMITH ET AL.
Maslach (1988) provide additional explanation for the construction of the Stress Arousal Scale and MROB version of the Maslach Burnout Inventory, respectively. 5. These problems were revealed in the preliminary measurement model test output, thus prompting the adjustment. 6. The eight tested factors were role conflict, role ambiguity, role overload, stress arousal, burnout, job satisfaction, performance, and turnover intentions. 7. The Wald test is a post-hoc procedure that capitalizes on a particular sample, i.e., it is not theory-driven. Replication with another sample is needed to determine whether the relations reported herein hold.
ACKNOWLEDGMENTS This research was supported in part by the American Institute of Certified Public Accountants.
REFERENCES Almer, E., & Kaplan, S. (2002). The effects of flexible work arrangements on stressors, burnout, and behavioral job outcomes in public accounting. Behavioral Research in Accounting, 14, 1–34. Anderson, J., & Gerbing, D. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411–423. Bandura, A. (1997). Self-efficacy: The exercise of control. New York: Freeman. Beck, A. T., Rush, A. J., & Shaw, B. F. (1979). Cognitive therapy of depression. New York: Guilford. Beehr, T., Welsh, J., & Tabor, T. (1976). Relationship of stress to individually and organizationally valued states: Higher order needs as a moderator. Journal of Applied Psychology, 61(1), 41–47. Bentler, P. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238–246. Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software, Inc. Bentler, P. M., & Wu, J. C. (1995). EQS for windows user’s guide. Encino, CA: Multivariate Software, Inc. Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley. Bryant, R. A., Sacksville, T., Dang, S., Moulds, M., & Guthrie, R. G. (1999). Treating acute stress disorders: An evaluation of cognitive behavior therapy and supportive counseling technique. American Journal of Psychiatry, 156, 1780–1786. Bullen, M. L., & Flamholtz, E. G. (1985). A theoretical and empirical investigation of job satisfaction and intended turnover in the large CPA firm. Accounting, Organizations and Society, 10(3), 287–302. Churchill, G., Ford, N., Hartley, S., & Walker, O. (1985). The determinants of salesperson performance: A meta-analysis. Journal of Marketing Research, 22, 103–118.
Contribution of Stress Arousal to the Beyond the Role Stress Model
157
Collins, K., & Killough, L. (1992). An empirical examination of stress in public accounting. Accounting, Organizations and Society, 17(6), 535–547. Cordes, C., & Dougherty, T. (1993). A review and an integration of research on job burnout. Academy of Management Review, 18, 621–656. Davy, J., Kinicki, A., & Scheck, C. (1991). Developing and testing a model of survivor responses to layoffs. Journal of Vocational Behavior, 38, 302–317. Donnelly, J., & Ivancevich, J. (1975). Role clarity and the salesman. Journal of Marketing, 39, 71–74. Dubinski, A., & Mattson, B. (1979). Consequences of role conflict and ambiguity experienced by retail salespeople. Journal of Marketing, 55, 70–86. Eden, D. (1990). Acute and chronic job stress, strain, and vacation relief. Organizational Behavior and Human Decision Processes, 45, 175–193. Everly, G. S., & Lating, J. M. (2002). A clinical guide to the treatment of the human stress response. New York: Kluwer. Everly, G. S., Sherman, M., & Smith, K. J. (1989). The development of a scale to assess behavioral health factors: The Everly Stress and Symptom Inventory. In: R. H. Feldman & J. Humphrey (Eds), Health education: Current selected research (Vol. II, pp. 71–86). New York: AMS Press. Everly, G. S., & Sobelman, S. A. (1987). The assessment of the human stress response. New York: AMS Press. Feldman, D., & Weitz, B. (1988). Career plateaus in the salesforce: Understanding and removing blockages to employee growth. Journal of Personal Selling and Sales Management, 8(1), 23–32. Fogarty, J., Singh, J., Rhoads, G., & Moore, R. (2000). Antecedents and consequences of burnout in accounting: Beyond the role stress model. Behavioral Research in Accounting, 12, 31–67. Girdano, D., & Everly, G. S. (1986). Controlling stress and tension (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall, Inc. Iaffaldono, M., & Muchinsky, P. (1985). Job satisfaction and job performance: A meta-analysis. Psychological Bulletin, 97, 251–273. James, L., Mulaik, S., & Brett, J. (1982). Causal analysis. Beverly Hills, CA: Sage. Kemery, E., Bedeian, A., Mossholder, K., & Touliatos, J. (1985). Outcomes of role stress: A multisample constructive replication. Academy of Management Journal, 28(2), 363–375. Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal, and coping. New York: Springer. Leiter, M., & Maslach, C. (1988). The impact of interpersonal environment on burnout and organizational commitment. Journal of Organizational Behavior, 9(3), 297–308. LePine, J., Podsakoff, N., & LePine, M. (2005). A meta-analytic test of the challenge stressorhindrance stressor framework: An explanation for inconsistent relationships among stressors and performance. Academy of Management Journal, 48(5), 764–775. Libby, R. (1983). Comments on Weick. The Accounting Review, 58(2), 370–374. Maslach, C. (1982). Understanding burnout: Definitional issues in analyzing a complex phenomenon. In: W. Payne (Ed.), Job stress and burnout: Research, theory and intervention perspectives (pp. 29–40). Beverly Hills, CA: Sage Publishers. Maslach, C., & Schaufeli, W. (1993). Historical and conceptual development of burnout. In: W. Schaufeli, C. Maslach & T. Marek (Eds), Professional burnout: Recent developments in theory and research (pp. 1–16). Washington, DC: Taylor & Francis. Michaels, C. E., & Spector, P. E. (1982). Causes of employee turnover: A test of the Mobley, Griffeth, Hand, and Meglino model. Journal of Applied Psychology, 67, 53–59.
158
KENNETH J. SMITH ET AL.
Millon, T. (1987). Manual for the MCMI-II. Minneapolis, MN: NCS. Multivariate Software, Inc. (2004). EQS 6.1 – Structural equations modeling software. Encino, CA: MVSOFT. Nunnally, J. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill. Oppenheim, A. (1966). Questionnaire Design and Attitude Measurement (p. 34). New York: Basic Books, Inc. Rizzo, J. R., House, R. J., & Lirtzman, S. I. (1970). Role conflict and role ambiguity in complex organizations. Administrative Science Quarterly, 15, 150–163. Schaubroeck, J., Cotton, J. L., & Jennings, K. R. (1989). Antecedents and consequences of role stress: A covariance structure analysis. Journal of Organizational Behavior, 10, 35–58. Selye, H. (1956). The stress of life. New York: McGraw-Hill. Singh, J., Goolsby, J., & Rhoads, G. (1994). Behavioral and psychological consequences of boundary spanning burnout for customer service representatives. Journal of Marketing Research, 31, 558–569. Smith, K. J., Davy, J. A., & Everly, G. S. (1995). An examination of the antecedents of job dissatisfaction and turnover intentions among CPAs in public accounting. Accounting Enquiries, 5(1), 99–142. Smith, K. J., Davy, J. A., & Everly, G. S. (2006). Stress arousal and burnout: A construct distinctiveness evaluation. Psychological Reports, 99, 396–406. Smith, K. J., Davy, J. A., & Stewart, B. R. (1998). A comparative study of the antecedents and consequences of job dissatisfaction and turnover intentions among women and men in public accounting firms. Advances in Public Interest Accounting, 7, 193–227. Smith, K. J., & Everly, G. S. (1990). An intra and inter-occupational analysis of stress among accounting academicians. Behavioral Research in Accounting, 2, 154–173. Smith, K. J., Everly, G. S., & Johns, T. R. (1993). The role of stress arousal in the dynamics of the stressor-to-illness process among accountants. Contemporary Accounting Research, 9(2), 432–449. Snead, K., & Harrell, A. (1991). The impact of psychological factors on the job satisfaction of senior auditors. Behavioral Research in Accounting, 3, 85–96. Stout, D. E., & Rebele, J. E. (1996). Establishing a research agenda for accounting education. Accounting Education, 1(1), 1–18. Weick, K. E. (1983). Stress in accounting systems. The Accounting Review, 58(2), 350–369. Williams, L., & Hazer, J. (1986). Antecedents and consequences of satisfaction and commitment in turnover models: A reanalysis using latent variable structural equation methods. Journal of Applied Psychology, 71(2), 219–231.
BUDGETARY FAIRNESS, SUPERVISORY TRUST, AND THE PROPENSITY TO CREATE BUDGETARY SLACK: TESTING A SOCIAL EXCHANGE MODEL IN A GOVERNMENT BUDGETING CONTEXT A. Blair Staley and Nace R. Magner ABSTRACT Latent variable structural equation analysis of questionnaire data from 1,358 U.S. federal government unit heads supported a proposed model in which procedural and interactional budgetary fairness reduce managers’ propensity to create budgetary slack by way of enhancing managers’ trust in their immediate supervisor. The results indicate that managers’ perceptions of procedural and interactional budgetary fairness each have a positive influence on their trust in supervisor, which, in turn, has a negative effect on their propensity to create budgetary slack. The direct effects of procedural and interactional budgetary fairness on propensity to create budgetary slack are not significant; rather, trust in supervisor fully mediates these relationships. An important implication of the results is Advances in Accounting Behavioral Research, Volume 10, 159–182 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10007-7
159
160
A. BLAIR STALEY AND NACE R. MAGNER
that budgets may contain less slack, and thus be more effective for purposes such as resource allocation, when organizational officials take steps to promote budgetary fairness. The items composing the budgetary fairness measures provide guidance in this regard.
INTRODUCTION Budgetary slack has been an important focus of research devoted to the design and implementation of effective organizational budgeting systems (e.g., Chow, Cooper, & Waller, 1988; Dunk, 1993; Merchant, 1985; Onsi, 1973; Stevens, 2002; Van der Stede, 2000; Young, 1985). Budgetary slack is created when, as a result of an organization’s budgeting process, ‘‘resources and effort are directed toward activities that cannot be justified easily in terms of their immediate contribution to organizational objectives’’ (Van der Stede, 2000, p. 611). The current paper examines managers’ propensity to create budgetary slack in the context of U.S. federal government budgeting, where the primary role of the budgeting process is to allocate resources among the government’s subunits and where the finalized budget authorizes a maximum level of spending for activities within each unit (Peters, 2004; Wildavsky, 1984). Covaleski, Evans, Luft, and Shields (2003, p. 10) stated that this resource allocation role is one of the essential features of organizational budgeting. During formulation of the President’s proposed budget, U.S. federal government unit heads submit requests for resources to operate their unit to higher government levels (Executive Office of the U.S. President, 2004; Peters, 2004; Wildavsky, 1984). These managers have several incentives to create slack in their unit’s budget by submitting inflated budget requests. Slack resources provide a buffer against the risk that unforeseen future events will have a negative impact on the unit’s ability to provide services (Van der Stede, 2000). Because evaluation of federal government managers’ performance is influenced by the amount and quality of their unit’s services (Senate Bill, 1993, p. 20; USGAO, 2000), managers may view budgetary slack as a means of ensuring they will receive favorable work-related outcomes. For example, the protection that budgetary slack offers against the effects of unforeseen negative events may increase the likelihood that the managers will receive performance-contingent extrinsic rewards such as promotions and pay raises (Merchant & Manzoni, 1989; Van der Stede, 2000); that they will see themselves as winners and avoid the stigma of being
Testing a Social Exchange Model in a Government Budgeting Context
161
underachievers (Merchant & Manzoni, 1989); and that they will gain credibility with other government officials that enables them to sell their ideas, to have resources allocated to their unit in the future, and to forestall intervention in the unit’s operations (Merchant & Manzoni, 1989; Wildavsky, 1984). Creating budgetary slack may also enhance federal government managers’ status and recognition among other parties who have a stake in their unit’s budget (e.g., the unit’s employees and its clientele) because the size of the budgetary allocation is an indicator of the managers’ effectiveness as the unit’s primary advocate in the competition for scarce government resources (Niskanen, 1971). Another reason that federal government managers may seek to build slack into their unit’s budget relates to the egocentric bias, whereby recipients in resource allocations often demand, and feel it is appropriate that they receive, a large share of the resources (Diekmann, Samuels, Ross, & Bazerman, 1997). Leung, Tong, and Ho (2004, p. 407) concluded that the egocentric bias is ‘‘deep seated and amazingly robust’’ across resource allocation settings. Budgetary slack in the form of organizational units receiving more resources than required can have dysfunctional effects on an organization (Nouri, 1994). In a U.S. federal government budgeting context, for example, units whose budget contains slack may be allocated a greater share of the government’s limited resources than can be justified given their contributions to the government’s overall policy objectives, which may have a negative impact on the government’s ability to achieve those objectives because resources are diverted from more effective units (Wildavsky, 1984). The potential negative effects of budgetary slack create a need for studies that identify those aspects of budgeting system design and implementation that influence managers’ propensity to create budgetary slack; Merchant (1985) and Nouri (1994) suggested that such research provides insight into ways of controlling budgetary slack when it is dysfunctional. Budgeting system variables previously found to be related to either the propensity to create budgetary slack or self-rated budgetary slack include budgetary participation (e.g., Govindarajan, 1986; Merchant, 1985; Nouri & Parker, 1996; Onsi, 1973) and a rigid budgetary control style (e.g., Onsi, 1973; Van der Stede, 2000). The purpose of the study reported in this paper is to test a model based on research regarding social exchange and organizational fairness that relates two aspects of a budgeting system – procedural budgetary fairness and interactional budgetary fairness – to managers’ propensity to create budgetary slack. Procedural budgetary fairness, which addresses budgeting system design, is the extent to which an organization’s formal structural
162
A. BLAIR STALEY AND NACE R. MAGNER
budgetary procedures comply with managers’ norms of entitlement or propriety. Interactional budgetary fairness, which addresses budgeting system implementation, is the extent to which budgetary decision makers implement budgetary procedures in a manner that complies with managers’ norms of entitlement or propriety. The model, shown in Fig. 1, proposes specifically that both procedural budgetary fairness and interactional budgetary fairness reduce managers’ propensity to create budgetary slack indirectly through the enhancement of trust in their immediate supervisor. The current study complements research by Little, Magner, and Welker (2002), who found that procedural and interactional budgetary fairness had an interactive effect on propensity to create budgetary slack such that managers had a particularly low propensity to create slack when both the formal budgetary procedures and the immediate supervisor’s implementation of those procedures were fair. Whereas Little et al. showed that procedural and interactional budgetary fairness work together to reduce propensity to create budgetary slack, the current study investigates a possible mechanism by which each form of budgetary fairness reduces propensity to create slack; specifically, by enhancing trust in the immediate supervisor. Latent variable structural equation analysis of questionnaire responses from U.S. federal government unit heads supported the proposed model. The results suggest that both fair formal budgetary procedures and fair implementation of those procedures reduce managers’ propensity to create budgetary slack, and that trust in the immediate supervisor plays a key mediating role in these processes. The measures of procedural and interactional budgetary fairness, in which each item measures a different fairness criterion, provide guidance to organizations and their officials as to specific ways they can enhance budgetary fairness. The remainder of the paper is organized as follows. The second section presents the relevant literature and theory on which the proposed model is
Procedural Budgetary Fairness
+ Trust in Supervisor
Interactional Budgetary Fairness
-
Propensity to Create Budgetary Slack
+
Fig. 1.
Proposed Model Linking Budgetary Fairness to Propensity to Create Budgetary Slack By Way of Enhancing Trust in Supervisor.
Testing a Social Exchange Model in a Government Budgeting Context
163
based and develops the specific hypotheses of the study. The third section explains the research method, and the fourth section describes the data analysis and presents the results. The final section discusses the results.
LITERATURE, THEORY, AND HYPOTHESES DEVELOPMENT Procedural and interactional fairness in organizational decision making have been examined in a number of contexts, including budgeting and related settings such as pay allocation and performance appraisal (see the reviews of the organizational fairness literature by Brockner & Wiesenfeld, 1996; Colquitt, Conlon, Wesson, Porter, & Ng, 2001; Greenberg, 1990). Norms that employees apply in assessing procedural fairness require, for example, that formal organizational decision-making procedures provide them an opportunity to voice their opinions and to appeal unfavorable decisions and that such procedures are made in a consistent fashion and are based on accurate information (Barrett-Howard & Tyler, 1986; Greenberg, 1986; Leventhal, 1980; Magner, Johnson, Sobery, & Welker, 2000; Tyler, 1988). In assessing interactional fairness, employees focus on factors such as whether decision makers treat those affected by their decisions with interpersonal sensitivity and provide clear and adequate explanations for their decisions (Bies & Moag, 1986; Brockner & Wiesenfeld, 1996; Tyler & Bies, 1990). Procedural and interactional fairness have each been found to affect important employee attitudes and behaviors such as job satisfaction, organizational commitment, organizational citizenship behavior, supervisory evaluations, turnover intentions, and performance (see the meta-analytic review of the organizational fairness literature by Colquitt et al., 2001). Scholars such as Konovosky and Pugh (1994) and Masterson, Lewis, Goldman, and Taylor (2000) have used social exchange theory as a basis for explaining the mechanisms by which procedural and interactional fairness influence attitudes and behaviors. According to social exchange theory, employees develop social exchange relationships with their organization and organizational authorities in which each party makes contributions to the relationship with the expectation of receiving comparable future benefits (Masterson et al., 2000). Employees seek both material (e.g., pay raises) and psychological (e.g., self-esteem) benefits from these ongoing organizational relationships (Brockner, 2002; Brockner, Siegel, Daly, Tyler, & Martin, 1997; Lind & Tyler, 1988). In social
164
A. BLAIR STALEY AND NACE R. MAGNER
exchange relationships, the exact nature of each party’s obligations is unspecified and the basis for measuring contributions is unclear (Blau, 1964, pp. 88–97; the following discussion of social exchange theory is based largely on Blau). Social exchange contrasts with economic exchange, where the exact nature of each party’s obligations is specified in advance, often by means of a formal contract that can be enforced through legal sanctions (Aryee, Budhwar, & Chen, 2002). The benefits that employees receive from the organization and organizational authorities in social exchange relationships obligate them to reciprocate to keep receiving the benefits. Employees reciprocate, and therefore preserve the social exchange relationship, primarily through behavior that furthers the goals of the organization (Masterson et al., 2000). While each party makes contributions to a social exchange relationship with a general expectation of some return, the exact nature of the return is left to the discretion of the one who will make it. There is no way to ensure an appropriate return for a contribution in a social exchange relationship and, in the short run, people are likely to perceive asymmetries between their contributions and the benefits they have received (Konovosky & Pugh, 1994). Therefore, a critical factor in the preservation of social exchange relationships is that each party trusts the other party to adequately discharge his or her obligations over the long run (Aryee et al., 2002; Blau, 1964; Konovosky & Pugh, 1994). Trust is the willingness of one party to be vulnerable to the actions of another party based on the expectation that the other will perform actions important to the trustor (Mayer, Davies, & Schoorman, 1995, p. 712). Research consistently indicates that procedural and interactional fairness in organizational decision making are important sources of employees’ trust in their organization and organizational authorities (e.g., Aryee et al., 2002; Brockner et al., 1997; Colquitt et al., 2001; Fichman, 2003; Konovosky & Pugh, 1994; Leung, Su, & Morris, 2001; Magner & Johnson, 1995; Magner & Welker, 1994; Mayer et al., 1995; Pillai, Schriesheim, & Williams, 1999; Van den Bos, Wilke, & Lind, 1998). Brockner (2002) and Brockner and Siegel (1996) attributed this phenomenon to information that procedural and interactional fairness convey to employees regarding the extent to which they can expect to receive valued benefits from their organization and organizational authorities over the long run. Next, Brockner’s explanations for why organizational fairness affects trust are applied to a budgeting setting and the hypotheses of the study are developed. Regarding interactional budgetary fairness, when budgetary decision makers carry out formal budgetary procedures in a fair manner through, for example, showing interpersonal sensitivity toward unit managers and
Testing a Social Exchange Model in a Government Budgeting Context
165
providing them with clear and adequate explanations for budgeting decisions, the managers are likely to view such treatment as a signal that they are valued members of the organization, which will provide them with psychological benefits like enhanced self-esteem. Also, such treatment indicates that budgetary decision makers have goodwill toward and mean well by the managers, which provides assurance to the managers that budgetary decision makers will take care to provide them with their fair share of material benefits such as favorable budgetary resource distributions and rewards linked to these budgets such as promotion and pay. Managers are likely to perceive their immediate supervisor as the budgetary decision maker who plays the primary role in the implementation of formal budgetary procedures, and thus as a key source of these valued material and psychological benefits. Furthermore, because of the fundamental attribution error (Gilbert & Malone, 1995; Heider, 1958; Ross, 1977), by which people tend to attribute the behavior of others to their relatively enduring personality traits, motives, and attitudes (i.e., their disposition) rather than to situational causes, managers will be inclined to view the way that the supervisor implements formal budgetary procedures as stable over time. Therefore, if managers perceive that their supervisor is currently implementing formal budgetary procedures in a fair manner, they will infer that the supervisor will continue to implement procedures fairly in the future, and thus will expect to receive the material and psychological benefits that stem from fair implementation over the long run. Based on the expectation that the supervisor will provide long-run benefits, managers will consequently be more willing to be vulnerable to the short-run actions of the supervisor in important organizational activities such as budgeting and thus have greater trust in the supervisor. The proposed linkage from interactional budgetary fairness to trust in supervisor leads to the first hypothesis of the study: H1. Interactional budgetary fairness has a positive effect on managers’ trust in supervisor. Regarding procedural budgetary fairness, formal organizational budgetary procedures that, for example, allow employees an opportunity to voice their opinions during budgeting and to appeal budgeting decisions, and that employ accurate information in budgeting decisions, signal to managers that they will realize their fair share of material benefits such as favorable budgets and consequent rewards. Fair formal budgetary procedures also provide managers with psychological benefits such as self-esteem in that the organization’s use of such procedures signifies that they are valued members
166
A. BLAIR STALEY AND NACE R. MAGNER
of the organization. Because formal organizational procedures are structural in nature and organizational inertia tends to allow structures to remain as they are, managers will be inclined to view the formal budgetary procedures as stable over time. Therefore, if managers believe that the formal budgetary procedures are currently fair, they will infer that the procedures will continue to be fair in the future, and thus will expect to receive the material and psychological benefits that stem from such procedures over the long run. Furthermore, because their supervisor is seen as a key agent of the organization and one whose budgetary behavior is constrained by the structure of the formal budgetary procedures, managers will perceive that fair formal budgetary procedures will enhance their supervisor’s ability to provide them with valued material and psychological benefits over the long run. Because managers expect to receive long-run benefits from the supervisor, they will allow themselves to be more vulnerable to the shortrun actions of the supervisor in areas such as budgeting, and thus have greater trust in the supervisor. The second hypothesis follows from the proposed linkage from procedural budgetary fairness to trust in supervisor: H2. Procedural budgetary fairness has a positive effect on managers’ trust in supervisor. Because interactional budgetary fairness and procedural budgetary fairness each enhance the extent to which managers trust their supervisor to provide valued material and psychological benefits over the long run, managers will feel an obligation to reciprocate in order to maintain the social exchange relationship. A viable way in which managers might reciprocate for the trust engendered by budgetary fairness is by exhibiting less propensity to submit biased budget estimates. Creating slack in the budget, a behavior intended to manipulate the organization’s budgetary process so as to further one’s own self-interests, is inconsistent with a desire to return the favor to the organization and organizational authorities for the design and implementation of fair budgetary procedures. The third hypothesis addresses the proposed linkage from trust in supervisor to propensity to create budgetary slack: H3. Managers’ trust in supervisor has a negative effect on their propensity to create budgetary slack. Taken together, the first three hypotheses indicate that trust in supervisor mediates the effects of interactional and procedural budgetary fairness on managers’ propensity to create budgetary slack. However, the hypotheses
Testing a Social Exchange Model in a Government Budgeting Context
167
do not address whether or not interactional and procedural budgetary fairness might in addition have direct effects on propensity to create budgetary slack that bypass trust in supervisor. If such direct effects exist, trust in supervisor would partially, but not fully, mediate interactional and procedural budgetary fairness effects on propensity to create slack. Given the centrality of trust to social exchange theories of organizational behavior, previous studies (Aryee et al., 2002; Konovosky & Pugh, 1994) have proposed, and provided empirical support for, models in which trust fully mediates relationships between interactional and procedural fairness in organizational decision-making and employee attitudes and behaviors. On the basis of this research, our last hypothesis proposes that trust in supervisor plays a similar role in the current budgeting context: H4. Managers’ trust in supervisor fully mediates the effects that interactional and procedural budgetary fairness have on their propensity to create budgetary slack.
METHOD Subjects and Procedures A questionnaire was mailed to all 9,643 members of the ‘‘Executive & Command’’ mailing list of U.S. federal government executives maintained by Government Executive Magazine. This group consists of senior-level executives who head federal agencies that provide services to the public or head main operating components of these agencies; the ‘‘Executive & Command’’ mailing list does not include federal executives primarily involved in internal support functions such as public and legal affairs, security, management information systems, engineering, purchasing, and personnel. During the formulation phase of the federal budgeting process for a given fiscal year, these executives must submit a budget request on behalf of their unit to the next highest level in the agency (in the case of operating components) or to the Office of Management and Budget (in the case of agencies) for resources to operate the unit during the year (Peters, 2004). Therefore, they participate in the budgetary process and have the opportunity to influence their unit’s budget, which is a necessary condition for them to build slack into the budget. The questionnaire included several mechanisms to ensure the respondents were, in fact, senior-level federal managers with budget responsibility. For
168
A. BLAIR STALEY AND NACE R. MAGNER
example, the lead paragraph of the questionnaire stated that the questionnaire was ‘‘for Federal managers who both supervise Federal personnel and have a budget.’’ Also, the respondents had to answer items that addressed whether or not they were a federal manager, the month and year that the most recent annual budget was established for their unit, and the highest-level unit for which they were responsible (e.g., bureau/agency, branch, office, division). If a respondent’s answers indicated that he or she was not a senior-level federal manager with budget responsibility, the questionnaire was not retained in the study. A total of 1,358 usable questionnaires were returned (a 14.1% response rate). Most (87.6%) respondents were male, and the respondents had a mean age of 49.9 years with a mean tenure of 24 years in the federal government.
Measures The variables in the proposed model were measured with scales used in previous research. Some of the respondents (e.g., agency heads) have budgetary roles both as a unit head requesting resources for their unit and as a budgetary decision maker reviewing the resource requests of subordinate managers. As the former role was the focus of the current study, respondents were told to rate each of the items measuring procedural budgetary fairness and interactional budgetary fairness in the context of ‘‘the most recent annual budget established for your unit.’’ As recommended by Colquitt (Colquitt, 2001; Colquitt et al., 2001), both the procedural and interactional budgetary fairness scales were indirect measures in which each item addresses a specific fairness criterion. In contrast to direct measures that simply ask how fair something is (e.g., ‘‘How fair are the organization’s formal budgetary procedures?’’), indirect measures provide insight into which fairness criteria people are applying (Colquitt, 2001, p. 380). Procedural budgetary fairness was measured with seven items developed by Moorman (1991). Because Moorman’s items relate to general organizational decision-making and do not reference specific decision makers, the items were adapted so as to reference the budgetary decision making context and budget decision makers. Respondents were asked to recall ‘‘the most recent annual budget established for your unit.’’ Representative items are: ‘‘Budget decision makers used formal procedures designed to collect accurate information necessary for making decisions on my budget,’’ ‘‘Budget decision makers used formal procedures designed to bring out the concerns of all those affected by the decisions on my budget,’’ and ‘‘Budget decision
Testing a Social Exchange Model in a Government Budgeting Context
169
makers used formal procedures designed to provide opportunities to appeal or challenge the decision on my budget.’’ Each item had a seven-point Likert response scale with the endpoints ‘‘Disagree’’ (1) and ‘‘Agree’’ (7), and the responses to the items were averaged. Moorman’s procedural fairness measure has been used in a number of previous organizational fairness studies (e.g., Goldman, 2001; Niehoff & Moorman, 1993). Interactional budgetary fairness was measured with five items developed by Moorman (1991). Each item was adapted to refer to ‘‘budget decision makers’’ rather than ‘‘your supervisor,’’ ‘‘employee’’ was replaced with ‘‘Federal manager’’ in one of the items, and one item was rephrased to refer specifically to budget decisions. Respondents were asked to recall ‘‘the most recent annual budget established for your unit.’’ Representative items are: ‘‘Budget decision makers treated me with kindness and consideration,’’ ‘‘Budget decision makers showed concern for my rights as a Federal manager,’’ and ‘‘Budget decision makers provided me with timely feedback about decisions on my budget and their implications.’’ Each item had a sevenpoint Likert response scale with the endpoints ‘‘Disagree’’ (1) and ‘‘Agree’’ (7), and the responses to the items were averaged. Moorman’s interactional fairness measure has been used in several previous organizational fairness studies (e.g., Niehoff & Moorman, 1993; Skarlicki & Folger, 1997). Trust in the immediate supervisor was measured with three items from Roberts and O’Reilly’s (1974) interpersonal trust scale. Representative items are ‘‘How free do you feel to discuss with your immediate superior the problems and difficulties you have in your job without jeopardizing your position or having it held against you later?’’ and ‘‘Immediate superiors at times must make decisions which seem to be against the interest of subordinates. When this happens to you, how much trust do you have that your immediate superior’s decision was justified by other considerations?’’ Each item had a seven-point Likert response scale with the endpoints (depending upon the item) ‘‘Very Cautious’’/‘‘Feel Very Distrustful’’/‘‘Have Little Confidence and Trust’’ (1) and ‘‘Completely Free’’/‘‘Trust Completely’’/ ‘‘Have Complete Confidence and Trust’’ (7), and the responses to the items were averaged. This measure has been used previously in both organizational fairness (e.g., Folger & Konovsky, 1989) and budgeting (e.g., Magner, Welker, & Campell, 1995) studies. Propensity to create budgetary slack was measured with four items, three of which were developed by Onsi (1973). One of Onsi’s items was adapted to be gender-neutral, to reference the ease of staying within a budget rather than safe attainment of a budget, and to replace ‘‘manager’’ with ‘‘Federal manager.’’ In a second item, the phrases ‘‘good business times,’’ ‘‘plant
170
A. BLAIR STALEY AND NACE R. MAGNER
manager,’’ and ‘‘departmental budget,’’ were replaced with ‘‘good times,’’ ‘‘Federal budget decision makers,’’ and ‘‘unit’s budget,’’ respectively. One item in Onsi’s original four-item scale related to a manager’s propensity to set two levels of production standards and did not apply to the current government budgeting context. This item was replaced with one addressing the effect of budgetary slack on organizational effectiveness. The items used in the current study are: ‘‘To protect himself (herself), a Federal manager must submit a budget that he (she) can easily stay within,’’ ‘‘In good times, Federal budget decision makers accept a reasonable level of slack in a unit’s budget,’’ ‘‘Slack in the budget is good to do things that cannot be officially approved,’’ and ‘‘A Federal organization runs more effectively when it has slack in its budget.’’ Each item had a seven-point Likert response scale with the endpoints ‘‘Disagree’’ (1) and ‘‘Agree’’ (7), and the responses to the items were averaged. Onsi’s scale has been used frequently in prior budgeting research (e.g., Govindarajan, 1986; Merchant, 1985; Nouri, 1994).
ANALYSIS AND RESULTS Means, Standard Deviations, Correlations, and Reliabilities Table 1 shows the means, standard deviations, correlations, and internal reliabilities for the variables of the study. All correlations were statistically significant ( po.05).1 The measures had high internal reliabilities, with the exception of the propensity to create budgetary slack measure, which had an alpha coefficient (.67) that is marginally below the .70 level recommended by Nunnally (1978).
Table 1.
Means, Standard Deviations, Correlations, and Reliabilitiesa.
Variable 1. 2. 3. 4.
Procedural budgetary fairness Interactional budgetary fairness Trust in supervisor Propensity to create budgetary slack
Mean
SD
1
2
3
4
3.85 4.15 5.14 3.73
1.53 1.58 1.63 1.26
(.94) .76 .40 .07
(.92) .44 .07
(.91) .11
(.67)
Significant at .05 level. Significant at .01 level. a
N=1,358. Alpha internal reliability coefficients are in parentheses on the diagonal.
Testing a Social Exchange Model in a Government Budgeting Context
171
Test of Nonresponse Bias As a test of nonresponse bias, the 80 earliest respondents were compared with the 80 latest respondents with respect to the four variables of interest (procedural budgetary fairness, interactional budgetary fairness, trust in supervisor, propensity to create budgetary slack) and three demographic variables (gender, age, tenure in the federal government). This test is based on the notion that late respondents are similar in important ways to nonrespondents (Armstrong & Overton, 1977). T-tests indicated no significant ( po.05) differences in the mean responses of the two groups. The results provide evidence that nonresponse bias was not a major problem in the study.
Primary Analysis The model in Fig. 1 was tested with latent variable structural equation analysis using the AMOS (Arbuckle, 2003) computer program. The analysis combined the proposed structural model specifying relationships between the four latent variables of interest with a measurement model that specifies relationships between the latent variables and the observed variables (i.e., individual questionnaire items) used to measure them. This approach adjusts the coefficients of the paths between the variables of interest in the structural model for the effects of random measurement error, which is beneficial since the internal reliability of the propensity to create budgetary slack measure was marginal. Structural equation modeling also provides indices for assessing the overall fit of the model to the data. For purposes of the analysis, procedural budgetary fairness and interactional budgetary fairness were treated as correlated exogenous variables, which statistically controls for the interrelationship between the two variables when the structural path coefficients in the model are estimated (Bollen, 1989). Chi-square (w2), the comparative fit index (CFI), and the root mean square error of approximation (RMSEA) were used to assess the overall fit of the model to the data. Table 2 presents these fit indices for the proposed model and, as a basis of comparison, for an independence model in which all the observed variables are assumed to be unrelated. Chi-square (w2 [149]=1029.23, po.001) of the proposed model was significant. Because w2 is really a ‘‘badness’’ of fit measure, this significant value suggests that the model did not have an acceptable fit to the data. However, when sample size is large, as in the current study, non-significant w2s are difficult to obtain even for structural equation models that have a good fit to the data from a
172
A. BLAIR STALEY AND NACE R. MAGNER
Table 2. Measures of Model Fit. Fit Measure Chi-square (degrees of freedom) Comparative fit index Nonnormed fix index Root mean square error of approximation
Proposed Model
Independence Model
1029.23 (149) .95 .95 .07
19184.42 (171) .00 .00 .29
Significant at the .001 level.
practical standpoint (Bagozzi & Heatherton, 1994; Hu & Bentler, 1995). Therefore, alternative measures of fit have been developed. The CFI and nonnormed fit index (NNFI), also known as the Tucker-Lewis Index, have been recommended as among the best of these measures (Gerbing & Anderson, 1993; Hu & Bentler, 1995; Marsh, Balla, & Hau, 1996). Both the CFI and NNFI were .95 in the current study, which met or exceeded both the traditional .90 (Medsker, Williams, & Holahan, 1994) and more rigorous .95 (Hu & Bentler, 1995, 1999) lower bounds recommended for good model fit. The RMSEA, which was .07, is below the .08 upper bound that Brown and Cudeck (1993) suggested represents reasonable model fit. On the whole, the fit indices indicate that the model had an acceptable fit to the data from a practical standpoint. Fig. 2 presents standardized coefficients for the hypothesized direct paths between the variables in the proposed structural model, standardized factor loadings for the items measuring each variable, and the correlation between the exogenous procedural and interactional budgetary fairness variables. All factor loadings were at least .50, with the exception of item 1 of the propensity to create budgetary slack scale. Of primary concern, the coefficients for the hypothesized direct paths between the variables in the proposed structural model were all significant at po.001 and in the expected direction. Hypothesis 1, which proposes that interactional budgetary fairness has a positive effect on trust in supervisor, was supported by the positive and significant direct path between interactional budgetary fairness and trust in supervisor. Hypothesis 2, which proposes that procedural budgetary fairness has a positive effect on trust in supervisor, was supported by the positive and significant direct path between procedural budgetary fairness and trust in supervisor. Hypothesis 3, which proposes that trust in supervisor has a negative effect on propensity to create budgetary slack, was supported by the negative and significant direct path between trust in supervisor and propensity to create budgetary slack.
Testing a Social Exchange Model in a Government Budgeting Context
Q1 .70
Q2 .79
Q3 .87
Q4
Q5
Q6 .87
.82
Q7
.84 .82
Procedural Budgetary Fairness
.14*** .82
.73
Q1
.79 Q2
Q1
Q2 Q3 .92 .91 Trust in Supervisor
Interactional Budgetary Fairness .72
173
Q1 .32 -.10***
Q2 .50
Q4
Q3 .77
.74
Propensity to Create Budgetary Slack
.32***
.83 .87 .82 Q3
Q4
Q5
*** Significant at the .001 level. a Q1, Q2…Qn represent the scale items measuring each variable of interest. Standardized factor loadings are presented for the arrows leading from each variable to its related scale items. Standardized path coefficients are presented for the unidirectional arrows from one variable to another. A correlation coefficient is presented for the bidirectional arrow between procedural budgetary fairness and interactional budgetary fairness.
Fig. 2.
Results of the Structural Equation Analysis of the Proposed Model.a
While the above results indicate that interactional and procedural budgetary fairness each reduce propensity to create budgetary slack indirectly by way of enhancing trust in supervisor, they do not explicitly address whether or not the two forms of budgetary fairness might also have a direct effect on propensity to create slack that bypasses trust in supervisor. To assess this possibility, the modification indices for the potential direct paths from interactional and procedural budgetary fairness to propensity to create budgetary slack that were not included in the proposed model were examined. The modification index for an excluded direct path between latent variables in a structural model indicates the change in w2 of the model if that path is added to the model. The modification indices for these two paths (interactional budgetary fairness to propensity to create budgetary slack=2.16, procedural budgetary fairness to propensity to create budgetary slack=2.83) were each less than the 5.0 cutoff suggested by Marsh and Hocevar (1985), indicating that the w2 fit of the model would not substantively improve by adding either of these paths to the model. As a second test of whether interactional budgetary fairness and procedural budgetary fairness have direct paths to propensity to create budgetary slack, these two paths were added to the proposed structural model and a latent variable structural equation analysis performed on this modified model. Neither direct path between a budgetary fairness variable and propensity to
174
A. BLAIR STALEY AND NACE R. MAGNER
create budgetary slack had a coefficient that approached conventional levels of significance (both ps>.40), while the paths in the model related to the hypotheses of the study were again significant. Taken together, the two tests provide evidence that interactional and procedural budgetary fairness each had only an indirect effect on propensity to create budgetary slack via the mediating variable trust in supervisor. These results support Hypothesis 4, which proposes that trust in supervisor fully mediates the effects that interactional and procedural budgetary fairness have on propensity to create budgetary slack.
DISCUSSION The results of this study indicate that both the fairness of an organization’s formal budgetary procedures and the fairness of budgetary decision makers’ implementation of these procedures are antecedents of managers’ propensity to build slack into budgets. Furthermore, managers’ trust in their immediate supervisor fully mediates the effects of both types of budgetary fairness on propensity to create budgetary slack. These findings, which support both hypotheses of the study, buttress a social exchange model of propensity to create budgetary slack. This model theorizes that managers develop social exchange relationships with both their organization and organizational authorities in which they make contributions with the expectation of receiving comparable future benefits. Procedural and interactional budgetary fairness enhance managers’ trust in the supervisor because budgetary fairness signifies to managers that the supervisor’s budget-related actions will enable them to realize both material and psychological benefits over the long run. To preserve their social exchange relationships with the supervisor, and thus receive these expected material and psychological benefits, managers will reciprocate for budgetary fairness and the resulting trust that it engenders by reducing their propensity to build slack into the budget. The current study contributes to the established literature addressing factors that influence managers’ propensity to create budgetary slack (Merchant, 1985; Nouri, 1994; Nouri & Parker, 1996; Onsi, 1973) as well as to the growing body of literature indicating that budgetary fairness influences employee attitudes and behaviors (e.g., Libby, 1999, 2001; Lindquist, 1995; Wentzel, 2002). The study also provides insight into the mechanisms by which budgetary fairness works – namely, by enhancing trust in supervisor. Future research should examine the model supported in this study with respect to other potential ways (beyond a reduced propensity to
Testing a Social Exchange Model in a Government Budgeting Context
175
create budgetary slack) that managers might choose to reciprocate for budgetary fairness and associated trust. These ways might include both budget-referenced reactions such as higher budget motivation and improved budgetary performance as well as more general reactions such as enhanced organizational citizenship behavior and lower turnover. Future studies should also investigate the specific reasons that procedural and interactional budgetary fairness enhance managers’ trust in their immediate supervisor. On the basis of prior psychology research, these effects were attributed in the current study to information that the two forms of budgetary fairness convey to managers regarding the extent to which they can expect to receive long-run material and psychological benefits from the supervisor. However, the explicit model tested here did not include variables that might provide detail regarding the processes that underlie the significant direct paths from the budgetary fairness variables to trust in supervisor. Because of this limitation, alternative explanations for these paths cannot be ruled out. Pertaining to this issue, Covaleski et al. (2003) recently noted that different theoretical perspectives from psychology-based research, economics-based research, and sociology-based research sometimes provide competing explanations of the causal processes underlying a given model of budgeting. Analysis of more complex models that specify variables such as managers’ expectations of favorable budgets in the future and managers’ self-esteem as intervening between budgetary fairness and trust in supervisor would provide evidence salient to confirming (or disconfirming) the explanation presented here. If this future research indicates that budgetary fairness enhances trust in supervisor, at least in part due to managers’ concerns with psychological benefits such as selfesteem rather than just material benefits such as favorable budgetary resource distributions and consequent rewards, it would support Luft’s (1997) argument that economic self-interest models, in which individuals are assumed to make choices only to maximize their material wealth, may not adequately explain all observed management accounting practice. An interesting research question is whether the current results from a government budgeting context generalize to budgeting in business organizations. A fundamental difference between these two budgeting settings that may have implications for this question relates to the relative importance each gives to the two basic roles of organizational budgeting: (a) the allocation of resources to organizational units, and (b) the evaluation of those units based on a comparison of actual versus budgeted results (Covaleski et al., 2003, p. 10). While government units use budgeting primarily for resource allocation, business organizations generally use budgeting as an integral part of performance evaluation as well. Because businesses place more emphasis
176
A. BLAIR STALEY AND NACE R. MAGNER
on the performance evaluation role of budgeting than do government units, the relationships that emerged in the current study of government managers may be stronger when tested with data from business managers. Business managers, as compared to government managers, are likely to see a more direct link between the budget established for their unit and the personal benefits they receive since their budget serves as a performance target against which they will be evaluated (and likely rewarded) rather than only as an authorization to spend. Therefore, business managers, when assessing the extent to which they can trust their supervisor to provide long-run benefits and deciding whether or not to build slack into their budget, may be more concerned than government managers with whether or not the formal budgetary procedures are fair and are being implemented fairly. Indirect evidence that some of the relationships found in the current study may be stronger when examined among business managers is provided by Little et al. (2002), whose subjects were primarily business managers and who reported that procedural and interactional budgetary fairness had correlations with propensity to create slack that were approximately 2.5 to 3 times greater than those in the current study. Beyond the question of whether the current results generalize to business managers, perhaps a more fundamental issue, given the approximately 14% questionnaire response rate, is whether the results pertain to the target population of senior-level U.S. federal government executives with budget responsibility. While this response rate is similar to that of previous studies involving federal government executives (e.g., de Lancer Julnes & Holzer, 2001; Yasin, Wafa, & Small, 2001), the possibility exists that people who did not return the questionnaire differ from the respondents in ways that are relevant to the relationships tested. The comparison of the earliest respondents with the latest respondents on the basis of the study’s primary variables and demographic items provides evidence, albeit not conclusive evidence, that this condition did not seriously undermine the results. Another limitation related to the respondents is that the extent to which they actually participate in the federal budgetary process is unknown. Budgetary participation will influence managers’ perceptions of budgetary fairness and is necessary to create slack in a budget. While federal managers are generally granted budgetary participation and that condition is an assumption of the study, the possibility exists that some of the respondents have little or no involvement in developing the budget for their unit. The structural equation analysis, which utilized data gathered with wellestablished measures of the four variables of interest, supported the proposed model by providing evidence that the hypothesized paths, but no other paths
Testing a Social Exchange Model in a Government Budgeting Context
177
between the variables, were significant, and that the overall model had a good fit to the data. However, this analysis only partly addresses three other limitations of the study. First, the direction of causality proposed between the variables cannot be proved because the data from each respondent were all gathered at a single point in time. For example, respondents’ trust in their immediate supervisor may actually be an antecedent, rather than a consequence, of their budgetary fairness perceptions. Second, one or more unmeasured variables omitted from the model may be common causal antecedents of the budgetary fairness variables, trust in immediate supervisor, and propensity to create budgetary slack. In this case, the significant model paths might have been weaker or even have disappeared had the omitted variables been controlled for. Third, reported relationships between the variables may be inflated because of a response bias due to the fact that none of the items comprising the four scales were reverse-scored (i.e., all the items were positively worded). While these three factors have the potential to undermine the value of the results, prior research suggests that their impact may not have been too great. Specifically, research spanning a variety of organizational decision contexts, including budgeting, consistently indicates that procedural and interactional fairness are antecedents of trust (e.g., Aryee et al., 2002; Brockner et al., 1997; Colquitt et al., 2001; Fichman, 2003; Konovosky & Pugh, 1994; Leung et al., 2001; Magner & Johnson, 1995; Magner & Welker, 1994; Mayer et al., 1995; Pillai et al., 1999; Van den Bos et al., 2001) and that trust is an antecedent of many types of functional employee behaviors and behavioral intentions (e.g., Aryee et al., 2002; Mayer et al., 1995; Pillai et al., 1999; Robinson, 1996). As noted in the front section of the paper, one benefit of studies that identify aspects of budgeting system design and implementation that influence managers’ propensity to create budgetary slack is that the results can give insight into ways of controlling slack. The current findings suggest that budgets may contain less slack, and thus be more effective for resource allocation purposes, if organizations and their officials take steps to design fair formal budgetary procedures and to implement these procedures in a fair way. The individual items comprising the indirect measures for the two fairness constructs, all of which had standardized loadings greater than .70 in the current study, indicate specific ways to enhance budgetary fairness in organizations. For example, senior-level officials should design formal budgetary procedures that collect accurate budgetary information and ensure that budget decisions are made in a consistent fashion; that allow managers to voice their concerns regarding budget decisions and to appeal those decisions; and that give managers adequate feedback regarding budget
178
A. BLAIR STALEY AND NACE R. MAGNER
decisions and allow them to request clarification about the decisions. When implementing budgetary procedures, budgetary decision makers should treat unit managers with kindness, show regard for their rights as employees, deal with them in a truthful manner, suppress personal biases, give consideration to the managers’ views and opinions, and provide timely feedback about budget decisions and their implications. Given that these practices and their associated benefits are likely not entirely evident to managers involved in the design and implementation of an organization’s budgeting system, organizations should utilize mechanisms like training seminars to educate managers as to how they can promote procedural and interactional budgetary fairness.
NOTE 1. Some research (e.g., Armstrong-Stassen, 1998) has indicated that women perceive less fairness in organizational decision-making than do men. To examine this phenomenon among the current respondents, gender was correlated with both procedural budgetary fairness and interactional budgetary fairness. A significant correlation (r=.08, po.01) emerged between gender and interactional budgetary fairness, with men reporting more fairness than women. The correlation between gender and procedural budgetary fairness was not significant. Regarding the other two major variables of interest in the study, gender had a significant correlation with trust in supervisor (r=.10, po.001), with men reporting more trust than women, but was not significantly correlated with propensity to create budgetary slack. Two other demographic variables, tenure and age, were also correlated with the four variables of interest. Tenure had a significant correlation with interactional budgetary fairness (r=.07, po.05), procedural budgetary fairness (r=.07, po.05), and trust (r=.07, po.05). Age did not have a significant correlation with any of the variables of interest. On whole, the correlations indicate that respondents who are men and those who have relatively longer tenure with the federal government reported somewhat higher levels of budgetary fairness and trust in supervisor.
ACKNOWLEDGMENTS We gratefully acknowledge Joel Brockner for providing us with his insight and suggestions regarding this paper.
REFERENCES Arbuckle, J. (2003). AMOS version 5.0. Chicago, IL: SmallWaters Corporation. Armstrong, J. S., & Overton, T. S. (1977). Estimating nonresponse bias in mail surveys. Journal of Marketing Research, 14, 396–402.
Testing a Social Exchange Model in a Government Budgeting Context
179
Armstrong-Stassen, M. (1998). The effect of gender and organizational level on how survivors appraise and cope with organizational downsizing. Journal of Applied Behavioral Science, 34, 125–142. Aryee, S., Budhwar, P. S., & Chen, Z. X. (2002). Trust as a mediator of the relationship between organizational justice and work outcomes: Test of a social exchange model. Journal of Organizational Behavior, 23, 267–285. Bagozzi, R. P., & Heatherton, T. F. (1994). A general approach to representing multifaceted personality constructs: Application to state self-esteem. Structural Equation Modeling, 1, 35–67. Barrett-Howard, E., & Tyler, T. R. (1986). Procedural justice as a criterion in allocation decisions. Journal of Personality and Social Psychology, 50, 296–304. Bies, R. J., & Moag, J. S. (1986). Interactional justice: Communications criteria of fairness. In: R. Lewicki, M. Bazerman & B. H. Sheppard (Eds), Research on negotiation in organizations (Vol. 1, pp. 43–55). Greenwich, CT: JAI Press. Blau, P. (1964). Exchange and power in social life. New York, NY: Wiley. Bollen, K. A. (1989). Structural equations with latent variables. New York, NY: Wiley. Brockner, J. (2002). Making sense of procedural fairness: How high procedural fairness can reduce or heighten the influence of outcome favorability. Academy of Management Review, 27, 58–76. Brockner, J., & Siegel, P. A. (1996). Understanding the interaction between procedural justice and distributive justice: The role of trust. In: R. M. Kramer & T. R. Tyler (Eds), Trust in organizations (pp. 390–413). Thousand Oaks, CA: Sage. Brockner, J., Siegel, P. A., Daly, J. P., Tyler, T. R., & Martin, C. (1997). When trust matters: The moderating effect of outcome favorability. Administrative Science Quarterly, 42, 558–583. Brockner, J., & Wiesenfeld, B. M. (1996). An integrative framework for explaining reactions to decisions: Interactive effects of outcomes and procedures. Psychological Bulletin, 120, 189–208. Brown, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K. A. Bollen & J. S. Long (Eds), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage. Chow, C. W., Cooper, J. C., & Waller, W. S. (1988). Participative budgeting: Effects of a truthinducing pay scheme and information asymmetry on slack and performance. The Accounting Review, 63, 111–122. Colquitt, J. A. (2001). On the dimensionality of organizational justice: A construct validation of a measure. Journal of Applied Psychology, 86, 386–400. Colquitt, J. A., Conlon, D. E., Wesson, M. J., Porter, C. O. L. H., & Ng, K. Y. (2001). Justice at the millennium: A meta-analytic review of 25 years of organizational justice research. Journal of Applied Psychology, 86, 425–445. Covaleski, M. A., Evans, J. H., III., Luft, J. L., & Shields, M. D. (2003). Budgeting research: Three theoretical perspectives and criteria for selective integration. Journal of Management Accounting Research, 15, 3–49. Diekmann, K. A., Samuels, S. M., Ross, L., & Bazerman, M. H. (1997). Self-interest and fairness in problems of resource allocation: Allocators versus recipients. Journal of Personality and Social Psychology, 72, 1061–1074. Dunk, A. S. (1993). The effect of budget emphasis and information asymmetry on the relation between budgetary participation and slack. The Accounting Review, 68, 400–410.
180
A. BLAIR STALEY AND NACE R. MAGNER
Executive Office of the U.S. President, Office of Management and Budget (2004). The budget system and concepts, budget of the United States government, fiscal year 2005 [On-line]. Available: http://www.whitehouse.gov/omb/budget/fy2005/pdf/concepts.pdf Fichman, M. (2003). Straining towards trust: Some constraints on studying trust in organizations. Journal of Organizational Behavior, 24, 133–157. Folger, R., & Konovsky, M. A. (1989). Effects of procedural and distributive justice on reactions to pay raise decisions. Academy of Management Journal, 32, 115–130. Gerbing, D. W., & Anderson, J. C. (1993). Monte Carlo evaluations of goodness-of-fit indices for structural equation models. In: K. A. Bollen & J. S. Long (Eds), Testing structural equation models (pp. 40–65). Newbury Park, CA: Sage. Gilbert, D. T., & Malone, P. S. (1995). The correspondence bias. Psychological Bulletin, 117, 21–38. Goldman, B. M. (2001). Toward an understanding of employment discrimination claiming: An integration of organizational justice and social information processing theories. Personnel Psychology, 54, 361–385. Govindarajan, V. (1986). Impact of participation in the budgetary process on managerial attitudes and performance: Universalistic and contingency perspectives. Decision Sciences, 17, 496–516. Greenberg, J. (1986). Determinants of perceived fairness of performance evaluations. Journal of Applied Psychology, 71, 340–342. Greenberg, J. (1990). Organizational justice: Yesterday, today, and tomorrow. Journal of Management, 16, 399–432. Heider, F. (1958). The psychology of interpersonal relations. New York, NY: Wiley. Hu, L. T., & Bentler, P. M. (1995). Evaluating model fit. In: R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76–99). Thousand Oaks, CA: Sage. Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indices in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. Konovosky, M. A., & Pugh, S. D. (1994). Citizenship behavior and social exchange. Academy of Management Journal, 37, 656–669. de Lancer Julnes, P., & Holzer, M. (2001). Promoting the utilization of performance measures in public organizations: An empirical study of factors affecting adoption and implementation. Public Administration Review, 61, 693–707. Leung, K., Su, S., & Morris, M. W. (2001). When is criticism not constructive? The roles of fairness perceptions and dispositional attributions in employee acceptance of critical supervisory feedback. Human Relations, 54, 1155–1187. Leung, K., Tong, K., & Ho, S. S. (2004). Effects of interactional justice on egocentric bias in resource allocation decisions. Journal of Applied Psychology, 89, 405–415. Leventhal, G. S. (1980). What should be done with equity theory? In: K. J. Gergen, M. S. Greenberg & R. H. Willis (Eds), Social exchange: Advances in theory and research (pp. 27–55). New York, NY: Plenum Press. Libby, T. (1999). The influence of voice and explanation on performance in participative budgeting. Accounting, Organizations and Society, 24, 125–137. Libby, T. (2001). Referent cognitions and budgetary fairness: A research note. Journal of Management Accounting Research, 13, 91–105. Lind, E. A., & Tyler, T. (1988). The social psychology of procedural justice. New York, NY: Plenum Press.
Testing a Social Exchange Model in a Government Budgeting Context
181
Lindquist, T. M. (1995). Fairness as an antecedent to participative budgeting: Examining the effects of distributive justice, procedural justice, and referent cognitions on satisfaction and performance. Journal of Management Accounting Research, 7, 122–147. Little, H. T., Magner, N., & Welker, R. B. (2002). The fairness of formal budgetary procedures and their enactment. Group and Organization Management, 27, 209–225. Luft, J. L. (1997). Fairness, ethics, and the effect of management accounting on transaction costs. Journal of Management Accounting Research, 9, 199–216. Magner, N., & Johnson, G. G. (1995). Municipal officials’ reactions to justice in budgetary resource allocation. Public Administration Quarterly, 18, 441–456. Magner, N., Johnson, G. G., Sobery, J. S., & Welker, R. B. (2000). Enhancing procedural justice in local government budget and tax decision making. Journal of Applied Social Psychology, 30, 798–815. Magner, N., & Welker, R. B. (1994). Responsibility center managers’ reactions to justice in budgetary resource allocation. Advances in Management Accounting, 3, 237–253. Magner, N., Welker, R. B., & Campell, T. L. (1995). The interactive effect of budgetary participation and budget favorability on attitudes toward budgetary decision makers: A research note. Accounting, Organizations and Society, 20, 611–618. Marsh, H. W., Balla, J. R., & Hau, K. (1996). An evaluation of incremental fit indices: A clarification of mathematical and empirical properties. In: G. A. Marcoulides & R. E. Schumacker (Eds), Advanced structural equation modeling: Issues and techniques (pp. 315–353). Mahwah, NJ: Lawrence Erlbaum. Marsh, H. W., & Hocevar, D. (1985). Applications of confirmatory factor analysis to the study of self-concept: First- and higher-order factor models and their invariance across groups. Psychological Bulletin, 97, 562–582. Masterson, S. S., Lewis, K., Goldman, B. M., & Taylor, M. S. (2000). Integrating justice and social exchange: The differing effects of fair procedures and treatment on work relationships. Academy of Management Journal, 43, 738–748. Mayer, R. C., Davies, J. H., & Schoorman, F. D. (1995). An integrative model of organizational trust. Academy of Management Review, 20, 709–734. Medsker, G. J., Williams, L. J., & Holahan, P. J. (1994). A review of current practice for evaluating causal models in organizational behavior and human resources management research. Journal of Management, 20, 439–464. Merchant, K. A. (1985). Budgeting and the propensity to create slack. Accounting, Organizations and Society, 10, 201–210. Merchant, K. A., & Manzoni, J. (1989). The achievability of budget targets in profit centers: A field study. The Accounting Review, 64, 539–558. Moorman, R. H. (1991). Relationship between organizational justice and organizational citizenship behaviors: Do fairness perceptions influence employee citizenship? Journal of Applied Psychology, 76, 845–855. Niehoff, B. P., & Moorman, R. H. (1993). Justice as a mediator of the relationship between methods of monitoring and organizational citizenship behavior. Academy of Management Journal, 36, 527–556. Niskanen, W. A., Jr. (1971). Bureaucracy and representative government. Chicago, IL: Aldine. Nouri, H. (1994). Using organizational commitment and job involvement to predict budgetary slack: A research note. Accounting, Organizations and Society, 19, 289–295.
182
A. BLAIR STALEY AND NACE R. MAGNER
Nouri, H., & Parker, R. J. (1996). The effect of organizational commitment on the relation between budgetary participation and budgetary slack. Behavioral Research in Accounting, 8, 74–90. Nunnally, J. (1978). Psychometric theory (2nd ed.). New York, NY: McGraw-Hill. Onsi, M. (1973). Factor analysis of behavioral variables affecting budgetary slack. The Accounting Review, 48, 535–548. Peters, B. G. (2004). American public policy: Promise and performance (6th ed.). Washington, DC: CQ Press. Pillai, R., Schriesheim, C. A., & Williams, E. S. (1999). Fairness perceptions and trust as mediators for transformational and transactional leadership: A two-sample study. Journal of Management, 25, 897–933. Roberts, K. H., & O’Reilly, C. A. (1974). Measuring organizational communication. Journal of Applied Psychology, 59, 321–326. Robinson, S. L. (1996). Trust and breach of psychological contract. Administrative Science Quarterly, 41, 574–599. Ross, L. (1977). The intuitive psychologist and his shortcomings. In: L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 10, pp. 173–220). San Diego, CA: Academic Press. Senate Bill 20 (1993). Government Performance and Results Act of 1993, S. 20, 103d Cong., 1st Sess. Skarlicki, D. P., & Folger, R. (1997). Retaliation in the workplace: The role of distributive, procedural, and interactional justice. Journal of Applied Psychology, 82, 434–443. Stevens, D. E. (2002). The effects of reputation and ethics on budgetary slack. Journal of Management Accounting Research, 14, 153–171. Tyler, T. R. (1988). What is procedural justice? Criteria used by citizens to assess the fairness of legal procedures. Law and Society Review, 22, 301–355. Tyler, T. R., & Bies, R. J. (1990). Beyond formal procedures: The interpersonal context of procedural justice. In: J. Carroll (Ed.), Applied social psychology and organizational settings (pp. 77–98). Hillsdale, NJ: Erlbaum. U.S. General Accounting Office (2000). Managing for results: Federal managers’ views show need for ensuring top leadership skills (GAO-01-127). Washington, DC: Author. Van den Bos, K., Wilke, H. A. M., & Lind, E. A. (1998). When do we need procedural fairness? The role of trust in authority. Journal of Personality and Social Psychology, 75, 1449–1458. Van der Stede, W. A. (2000). The relationship between two consequences of budgetary controls: Budgetary slack creation and managerial short-term orientation. Accounting, Organizations and Society, 25, 609–622. Wentzel, K. (2002). The influence of fairness perceptions and goal commitment on managers’ performance in a budget setting. Behavioral Research in Accounting, 14, 247–271. Wildavsky, A. (1984). The politics of the budgetary process. Boston, MA: Little, Brown. Yasin, M. M., Wafa, M. A., & Small, M. H. (2001). Just-in-time implementation in the public sector: An empirical examination. International Journal of Operations and Production Management, 21, 1195–1204. Young, S. M. (1985). Participative budgeting: The effects of risk aversion and asymmetric information on budgetary slack. Journal of Accounting Research, 23, 829–842.
THE QUESTION OF DISCLOSURE: PROVIDING A TOOL FOR EVALUATING MANAGEMENTS’ DISCUSSION AND ANALYSIS Lori Holder-Webb ABSTRACT The quality of Management Discussion and Analysis (MD&A) disclosure is of perennial concern to regulators and practitioners and has recently gained the attention of investors with the formation of the Enhanced Business Reporting Consortium. Research on the quality of narrative elements of the business report such as the MD&A has long been hampered by a lack of empirical tools to permit objective analysis of qualitative disclosures. The need for such a tool is not limited to analysis of existing disclosures; current policy calls for expanded disclosure that may come at the cost of investor cognitive overload. This paper provides a more intensive tool for evaluating the qualitative disclosures contained in the MD&A than is currently publicly available. The tool yields a quantitative measure of disclosure that can be used in further empirical or experimental research, or by those wishing to objectively compare disclosures across managers, across firms, or across time.
Advances in Accounting Behavioral Research, Volume 10, 183–217 Copyright r 2007 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1016/S1475-1488(07)10008-9
183
184
LORI HOLDER-WEBB
INTRODUCTION This paper provides a tool for objectively evaluating the quality of disclosures in the Management’s Discussion and Analysis (MD&A) section of the Annual Report. The primary extant tools available to researchers who wish to examine disclosure behavior at the firm level either do not permit comparison between firms issuing only 10-Ks and those for whom the 10-K is included in the Annual Report (the Botosan Index), or do not permit direct measurement of the disclosures at all, relying strictly on informed third party opinions about the general quality of the disclosure package (the FAF or AIMR reports). This creates a difficulty for researchers who wish to directly evaluate the disclosure of firms in an environment where multiple reporting formats are permitted or under circumstances of imperfect archiving. The MD&A is required regardless of reporting format; this study is therefore an attempt to permit direct evaluation of the disclosure behavior of any firm that is publicly traded on a U.S. exchange. The quality of business reporting has been of perennial concern to regulators and practitioners and has recently gained impetus with the formation of the Enhanced Business Reporting Consortium (EBRC), a group of industry and market participants concerned about the role of quality reporting in the smooth functioning of the capital markets. The EBRC has, as a main goal, the development of a conceptual framework for disclosures (Enhanced Business Reporting Consortium, 2005).1 The EBRC framework calls for disclosures that ‘‘provide information about the economic resources and performance of an enterprise and help users assess the amounts, timing, and uncertainty of prospective net cash flows. In particular, disclosures should describe and provide disaggregated information for recognized items, describe unrecognized items, and provide information to help investors and creditors assess the risks and potential of both recognized and unrecognized items’’ (EBRC, 2005). The EBRC criteria cover ground that is traditionally relegated to the MD&A section of the Annual Report, considered by the SEC to be a critical component of the firm’s annual report (Securities Exchange Commission, 2003). The MD&A is ‘‘intended to give the investor an opportunity to look at the company through the eyes of management by providing both a short and long-term analysis of the business of the company’’ (Securities Exchange Commission, 1987). Subsequent to issuing the Concept Release in 1987, the SEC reviewed the MD&As of 359 companies and found that 345 of those were inadequate under the regulators’ expectations. Of the 345 firms, 134 had disclosures sufficiently deficient as to require immediate amendment
Providing a Tool for Evaluating Managements’ Discussion and Analysis
185
(Heyman, 1989). This high rate of deficiencies resulted in further authoritative statements on the subject from the SEC, in the form of Financial Reporting Release No. 36 (FRR 36) (Securities Exchange Commission, 1989). In FRR 36, the SEC stated that the MD&A was an essential disclosure necessary to investors’ ability to assess the prospects of the company over the future and provided a laundry list of disclosures based on the individual inadequacies identified in the earlier study. Despite the release of FRR 36, the SEC continued to issue numerous and regular warnings about the content of the MD&A to the investing community and to individual firms throughout the 1990s. As a result of this ongoing level of regulatory dissatisfaction, a petition was submitted to the SEC requesting further clarification of the reporting requirements for the MD&A. This petition was signed by the largest accounting firms, endorsed by the AICPA, and delivered to the SEC at the end of 2001. The petitioners noted that ‘‘The existing requirements of MD&A are intentionally broad and flexible to permit registrants to meet the requirements for disclosure in a manner that is customized to each registrant’s circumstances. The flexibility in the rule was intended to elicit high quality and highly informative disclosure’’ (Securities Exchange Commission, 2001). This request resulted in the issuance of FR-72 (Securities Exchange Commission, 2003), wherein the SEC stated that ‘‘y disclosure must be both useful and understandabley investors will often find information relating to a particular matter more meaningful if it is disclosed in a single location, rather than presented in a fragmented manner throughout the filing.’’ Clearly, regulators designated with maintaining the integrity of the U.S. capital markets consider the ready availability of qualitative information about the firm to be essential to the smooth operation of those markets. The need for transparency only continued to be highlighted with a wave of reporting and business failures, such as those at Enron, Global Crossing, and WorldCom. Increased transparency and disclosure are significant components of the Sarbanes-Oxley Act of 2002 (hereafter, Sarbanes-Oxley Act). Specifically, the Act mandates new disclosures about off-balance-sheet financing, internal controls, pro-forma reports, and insider-trading activity (Sarbanes-Oxley Act, 2002). The Sarbanes-Oxley Act is one of the most costly pieces of legislation in the recent history of corporate America, and as such, is providing grist for a fresh and developing literature in disclosure research (e.g., Paredes, 2003; Cohen, Dey, & Lys, 2005; Bratton, 2003). This study develops a new tool for evaluating the qualitative information contained in the MD&A section of the Annual Report, and yields a quantitative disclosure measure that can be used for comparison or in further
186
LORI HOLDER-WEBB
statistical analysis. The strength of this measure is that it permits assessment of a more broad population of firms than was previously possible in an attempt to redress a lack in the toolkit available to those interested in the rigorous evaluation of disclosure practices.
LIMITATIONS OF CURRENT MEASURES The majority of reporting quality research is based on analytical models and on the analysis of market data (for a review of this literature see Verrecchia, 2001; also Kim & Verrecchia, 1991, 2001; Diamond & Verrecchia, 1991 for seminal works in this area). While these studies contribute significantly to our understanding of the role of disclosure in the markets, they do not shed light on the qualitative attributes of the disclosures themselves, nor on the content of the disclosures. The remaining disclosure quality research is divided into two broad paths: analysis of the firm’s overall information environment and analysis of particular elements of that environment. Studies in the first category tend to employ broad measures of quality, generally based on external parties’ ratings of the firm’s information environment (as in Lang & Lundholm, 1993, 1996, 2001; Healy, Hutton, & Palepu, 1999; Botosan & Plumlee, 2002, among others). Studies in the second category rely upon content analysis, frequently executed using off-the-shelf content analysis software packages (Frazier, Ingram, & Tennyson, 1984; Bryan, 1997; Clarkson, Guedes, & Thompson, 1999) or on proprietary measures (as in Botosan, 1997). Analyst ratings of the information environment are developed within a rich information context that incorporates knowledge about the firm, its industry, and prevailing trends in disclosure, and thus provide a simulation of investor perceptions of disclosure quality in the actual investing environment. However, they are formulated from an existing basis of information about the firm, its industry, and trends in disclosure and do not offer an opportunity to examine the actual disclosure. These measures also relegate the analysts’ evaluation process to a ‘‘black box,’’ which permits little assessment of the role of the business reports in the formulation of decisions or judgment. They are also difficult to obtain. Existing databases cover only a limited span of years, and only the largest firms are rated. Use of this data, therefore, necessarily introduces a data-driven size bias into studies. Furthermore, they are not useful for analyzing firms with recent public offerings, those in distress, or those at general risk of contracting problems (i.e., small or closely
Providing a Tool for Evaluating Managements’ Discussion and Analysis
187
held firms). Researchers who are interested in exploring disclosure quality issues among smaller, less well known, distressed, or younger firms are therefore unable to use existing archival databases. The second strain of firmbased disclosure quality measures is based on content analysis (primarily computerized), or the creation and application of indices. While Frazier et al. (1984) describe a computerized content-analytic based procedure for evaluating narrative accounting disclosures (footnotes), this approach is not appropriate for less formulaic disclosures such as the MD&A.2 The limitations of archival disclosure data and computerized content analysis are to create a pressing need for an alternative approach to evaluating disclosure quality for the MD&A of smaller firms. This area includes studies like those of Hooks and Moon (1993) and Botosan (1997). Hooks and Moon (1993) provide a checklist of disclosures required and/or suggested by the SEC in the early 1990s. However, this checklist is designed for broad assessment and does not yield a quantitative score that is suitable for further analysis. Furthermore, the SEC has in recent years expanded the domain of disclosures that are suitable for inclusion in the MD&A. The Hooks and Moon (1993) checklist has become out-of-date. Botosan (1997) provides a validated quantitative approach for measuring the content of the entire Annual Report, including the MD&A. However, comparisons of firms using the Botosan Index require, as a function of the index contents, that all firms employ identical reporting formats. That is, her index is sensitive to the inclusion (exclusion) of information mandatory to the 10-K (Annual Report); its use in a context where multiple reporting formats are permitted or where data availability problems preclude obtaining identical reporting formats will engender mechanical variability among the scores of different firms. Furthermore, the MD&A section of the Botosan Index is relatively short, and permits only the most cursory examination of MD&A disclosures. The current study addresses the limitations of both these measures by updating the coverage of the MD&A for the prevailing information environment, by providing a validated method for transforming qualitative disclosures into a quantitative variable suitable for statistical analysis, and by permitting more intensive evaluation of the MD&A. Furthermore, the extant body of tools limits the ability of researchers to directly measure disclosure quality of firms in an environment where multiple reporting formats are permitted or in an environment characterized by imperfect archiving of published reports. The MD&A, on the other hand, is a required element of any published Annual Report or 10-K, and does not differ in content based upon the choice of that reporting format. While the MD&A Index permits
188
LORI HOLDER-WEBB
assessment of a smaller portion of the disclosure package than does the extant collection of tools, it permits evaluation of what regulators consider to be an extremely important component of that package and significantly broadens the population of firms for which disclosure quality can be evaluated.
THE INDEX The MD&A Index (the Index) provides the ability to evaluate the disclosure quality of the firm’s MD&A and to generate a quantitative value for that quality that may then be used in further empirical analysis. The Index is similar in style to that developed by Botosan (1997), which is used to generate a quantitative measure of the disclosure quality for an entire Annual Report. The contents of the Index are based upon the SEC guidelines as provided through FR-72 (Securities Exchange Commission, 2003), Financial Reporting Release 36 (FRR 36) (Securities Exchange Commission, 1989), a practitioner checklist provided and validated by Hooks and Moon (1991, 1993), and a review of internal practitioner guidelines developed by a major international accounting firm. In this material, the Index also covers many of the items addressed in Level 1 of the Enhanced Business Reporting Framework (EBRC, 2005). The items chosen for inclusion in the Index are based on an interpolation of practice and regulatory emphasis. Individual items were drawn both from an extensive checklist for MD&A preparation provided by a major international accounting firm (provided under conditions of anonymity), and from the many statements issued by the SEC on the desired content of the MD&A and cited above. In general, the SEC pronouncements emphasize disclosures of trends, projections, and other forward-looking information, as well as the need to provide useful and complete information about the reasons why current results may not be indicative of future results. Therefore, most items are eligible both for credit and for basic and enhanced levels of disclosure. The items that received repeated emphasis in the SEC pronouncements are broken out into multiple items on the Index, which implicitly assigns a higher weighting to these items in the computation of the DQUAL score.3 In summary, the Index contains a considerable amount of weighting for items that have been deemed by practitioners and/or regulators to be highly significant disclosures, and the determination of what items should be weighted more heavily was made pursuant to the relative amount of emphasis received by those items in the regulatory pronouncements.4
Providing a Tool for Evaluating Managements’ Discussion and Analysis
189
The index covers the general business and the business environment, the results of operations, information about capital resources, and the company’s liquidity situation. Additional items are those useful for decision-making, including descriptions of the company’s primary business lines; the contribution of each line to the financial results; descriptions or discussions of major products, customers, markets, and facilities; order backlogs; employee counts; and the effect of recent acquisition activity on current results. There are a total of 46 items, worth a maximum total of 82 points, on the Index. As the Index is sensitive to the business context (i.e., not all firms will have all possible items to disclose), the total of 82 is adjusted downward to reflect the absence of situation-dependent disclosure items. The complete Index, including scoring directions, appears as Appendix A. Appendix B contains two scored sample excerpts for purposes of illustration.
General Directions for Use Preliminary work on each MD&A should include a thorough reading of the report and other public information releases from the firm during the year (excluding the MD&A) to determine the presence of multiple operating segments, key organizational changes, liquidity problems, unusual or significant events, and unusual or extraordinary items. As noted above, the Index is sensitive to the presence or absence of situation-dependent disclosure opportunities. Managers are therefore rewarded for providing discussion of relevant items, but only penalized for non-disclosure when it is certain that a missed disclosure opportunity exists. Common items that fall into this category have been indicated with an NA (not applicable) notation on the item line. For example, firms may not have had any extraordinary items, may operate only in one segment, and may not have been engaged in any MD&A activity during the fiscal year. Service firms, similarly, are unlikely to have order backlogs. The denominator of the Index should be adjusted downward by the number of available points for any item that may exist for a given firm in a particular year to ensure that firms are not penalized inappropriately. Credit for discussing operating results is contingent upon the manager providing basic information that is incremental, to that contained in the financial statements. For example, no credit would be awarded for a statement that revenues have increased, or even for a statement that revenues have increased by a given quantity. Managers must explain the increase with a discussion of price or volume changes to receive basic credit. The Index is also sensitive to the detailed level of information provided about a topic, in
190
LORI HOLDER-WEBB
that additional credit is available for most operating items if the managers provide enhanced (more detailed or quantitative) information. Items that have been specifically emphasized by the SEC in the most recent definitive clarification of requirements (SEC, 2003) are likewise emphasized in the instrument. In particular, detailed discussions of unusual or non-recurring items and providing projections, trends, and forecasts (all items specifically noted in FR-72) provide an opportunity for significant boosts in the final disclosure quality score. Firms receive basic credit for mentioning the unusual item and enhanced credit for providing quantitative information on the items impact on current operations; beyond this, managers may expand the disclosure by offering discussion of the projected future effects of the item and possibly quantifying those future effects as well. For example, if the firm restructured its operations during the year, 1 point would be awarded for a basic discussion of the restructuring that lists the components of the charge, and another point for enhanced discussion if the managers provided dollar figures for each component, for a total of 2 points. An additional point is available to managers who provide a basic discussion of the implications of the restructuring for future operations, as well as a final point for enhancing that discussion by quantifying these implications. An explanation of a restructuring charge that included the information that the charge is $119 million ($45 million for employee layoffs, $20 million for write-offs, and the remainder to building closures) and that this action is expected to result in cost savings approximating $8 million after 1996 would receive the highest possible credit for this category, 4 points. The number of points available on this section (and all other sections of the Index) is contingent upon the presence of certain elements in the company’s operating environment. Where a given element did not exist for the time period under consideration, the scoring form includes the term ‘‘NA’’ and those points are deducted from the maximum number of points available. For example, if a firm has no extraordinary or unusual items to report in the current period, a total of 4 points would be deducted from the 82 point total for possible items (leaving a maximum possible score for that firm of 78 points before considering other potentially NA items). The capital and liquidity sections follow in a similar vein; credit is awarded for basic discussions of relevant information, extra credit is awarded for enhanced discussions offering more detail or quantifications pertaining to the item, and the disclosure points are adjusted for items that are not relevant to the firm during that fiscal year. Detailed instructions for all items are included in Appendix A as part of the instrument.5
Providing a Tool for Evaluating Managements’ Discussion and Analysis
191
Computing the Disclosure Score The final product of this process is a quantitative disclosure score (DQUAL) that represents the number of disclosure opportunities taken relative to the number of disclosure opportunities presented for the firm during the period covered by the report. Firms with poor or low-quality disclosures will either miss the points entirely (if the item is not discussed), or will be awarded half of the available credit (if the item is discussed in basic terms only, not in significant detail). After considering the firm’s operations and other information releases during the year to evaluate which, if any, of the NA items should be omitted from the score, the MD&A should be scored with the Index. Appendix A includes a useful form for recording these scores and directions for evaluating each applicable and situation-dependent variable. After scoring each item according to the directions, the sum total of points awarded should be divided by the sum total of points available to the firm (82 minus the points available to situation-dependent disclosures, or NA items, not applicable to the firm during the period). The result of this computation is a percentage of firm-specific disclosure opportunities that were seized by the reporting firm (DQUAL). If more disclosure is better, the disclosure score translates to a measure of disclosure quality; if it is not (as suggested by Paredes, 2003), the score is a measure of quantity. Appendix B contains excerpts from scored reports, along with the scoring form and points awarded for the excerpts.6 Both excerpts relate to the discussion of current operating results. The first is for a firm with very lowquality disclosures, and the disclosure has been included in its entirety. The second disclosure is from the opposite end of the spectrum, and has been edited for length. Deleted sections are summarized in brackets. Each excerpt is followed by the segment of the MD&A Index pertaining to the discussion of operating results. The DQUAL for the first firm is very low, under .2. The second firm has a more extensive disclosure policy, and rates a very high score. In this case, the volume of information corresponds with the DQUAL score. Note, however, that this is not a generalizable relationship: MD&As containing a great deal of boilerplate will have lengthy disclosures that will not engender a higher DQUAL than MD&As that are well written, succinct, and transparent. Likewise, MD&As that simply summarize the information available on the face of the financials will not receive credit, as the information is not incremental. The key with the use of this Index is reflected in the SEC’s consistent guidance regarding the construction of the MD&A: simply repeating the numbers is not sufficient; the information must be
192
LORI HOLDER-WEBB
substantive and incremental to what is available through the audited financials. Provision of point estimates and forward-looking information is especially encouraged, as reflected in the higher points available to firms that provide this information in addition to the discussion of current results. According to Marston and Shrives (1991), if it is not possible to construct a disclosure index where all items are applicable to all firms, index scores may be converted to relative scores by dividing the score obtained by the maximum score possible for the firm. Accordingly, the final disclosure score (DQUAL) can be considered as the ratio of disclosure opportunities taken to the total disclosure opportunities available, and may assume any value between 0 and 1. Technically, this technique yields an ordinal-level score. However, Marston and Shrives (1991) review a range of econometric theories and conclude that if the underlying ordinal data is normally distributed, it may be subjected to parametric tests such as those described later. The disclosure score data from the validation process (discussed below) are approximately normally distributed, with a mean of .319 and a standard deviation of .12. Therefore, the DQUAL score should be considered suitable for use with standard econometric techniques such as regression (keeping in mind that the resulting variable is bounded).
Validation of the Index Construct and Content Validity Validity indicates the extent to which an instrument measures what it is purported to measure; in this study, validity pertains to whether the MD&A Index effectively measures the content of the MD&A in a manner that is complete and replicable. This requires establishing the content and construct validity of the measure, as well as the consistency of results of applying the measure between evaluators (inter-rater reliability). Content validity is the degree to which an instrument or operationalized variable maps against the relevant content domain (Trochim, 2000). In the case of MD&A content, the content domain is essentially established by the SEC, who set the reporting requirements for all publicly traded firms on U.S. exchanges. The MD&A is generated in response to an SEC requirement to provide this information in a well-defined report. Thus, the specific details of the content domain can be derived from the numerous SEC statements of requirements, preferences, and clarifications that have been published over the last two decades (SEC, 1987, 1989, 2001, 2003). The Index items, as described above, are drawn both from a practitioner checklist for
Providing a Tool for Evaluating Managements’ Discussion and Analysis
193
MD&A and from the SEC pronouncements involving the required and desired content. To the extent that the responsibility for determining the reporting guidelines ultimately rests with the SEC, the criteria that constitute the content domain are definitively established. To the extent that the MD&A Index is drawn directly from that material, the Index possesses content validity. Construct validity determines the degree to which inferences can be made from the variable arising from the measurement process (Trochim, 2000). For the purposes of establishing the construct validity of the DQUAL score, this study follows the methodology established by Botosan (1997) who evaluates the construct validity of her measure of disclosure quality by benchmarking against multiple extant measures of factors related to disclosure quality. To this end, a random sample of 50 Compustat firms between 1989 and 1994 and representing a spectrum of financial health, firm size, and industry was obtained.7 See Table 1 for a description of the validation sample. One measure of the information environment of a firm – of which disclosure quality is an element – is the amount of information provided in the public news domain. Therefore, following Botosan (1997), the public news domain is measured by counting the number of articles published about the validation sample firm in the Wall Street Journal during the year. For the validation sample, the Spearman correlation coefficient is positive and significant at the .07 level.8 Botosan (1997) also cites a set of variables (derived from a meta-analysis by Ahmed, 1995) as having positive relationships to disclosure quality: firm size, exchange listing status, audit firm size, and leverage. In the interests of consistency with the prior research, the study follows Botosan’s (1997) operationalization of these variables, which should be positively related to the DQUAL score. Results are reported in Table 2 and are substantially similar to those reported by Botosan (1997) for MVE and STATUS. AUDITOR is also positively related to DQUAL.9 The final step employed by Botosan (1997) in establishing that her measure converges with other known measures of disclosure quality is to perform rank-order correlations between her measure and the only other extant, contemporaneous, and generally-accepted measure of disclosure quality, the AIMR reports.10 Following this, rank-order correlations are run between the DQUAL measure and that of Botosan (1997), and between the DQUAL measure and the AIMR ratings. Positive relationships are expected for both. Results are reported in Table 3. The rank-order correlation of the MD&A DQUAL measure with Botosan’s (1997) DSCORE is positive and
194
LORI HOLDER-WEBB
Table 1. Description of Validation Sample. Panel A: Validation sample by industry Industry
N
Broadcast communications Electronics manufacturing Entertainment General manufacturing Paper products Real estate Retail Steel refining and milling Natural gas/electricity retailing
3 11 4 10 6 1 10 1 4
Panel B: Validation sample by year Year
N
1989 1990 1991 1992 1993 1994
3 5 6 6 6 24
Panel C: Descriptive statistics for validation sample Variable
Mean
Median
Min
Max
MVE DE_RAT
1417.18 3.95
66.20 .83
1.00 3.71
20855.00 131.94
Exchange Auditor
NYSE or AMEX Big 6
N 30 43
NASDAQ or OTC Non big 6
N 20 7
The significantly larger sample from 1994 is due to the inclusion of firms providing AIMR
reports, which are available only through 1994.
MVE is the market value of equity for the firm measured at the end of the year in which the
disclosure data was collected (1989–1994). DE_RAT is the ratio of long-term debt to stockholder’s equity at the end of the year.
significant, as expected.11 However, the rank-order correlation of DQUAL with the AIMR reports is not significantly different from zero.12 Reliability A necessary step in the development of an index for evaluating content is to establish the ability of multiple individuals to use the index with reliable
Providing a Tool for Evaluating Managements’ Discussion and Analysis
Table 2.
195
Correlation Analysis of DQUAL and Firm Characteristics Related to Disclosure Quality.
Correlation p-value N
MVE
DE_RAT
STATUS
AUDITOR
.413 .002 48
.068 .322 49
.467 .000 50
.208 .080 50
One-tailed test.
**MVE is the market value of equity for the firm measured at the end of the year in which the disclosure data was collected (1989–1994). DE_RAT is the ratio of long-term debt to stockholder’s equity at the end of the year. STATUS is an indicator variable assuming a value of 1 if the firm is traded on the NYSE or the AMEX, and 0 if the firm is traded on the NASDAQ or OTC. AUDITOR assumes a value of 1 if the firm had a big-6 auditor during the year, 0 otherwise.
Table 3.
Correlation p-value N
Correlation Analysis of DQUAL with Other Known Quantitative Measures of Disclosure Quality. Botosan (1997) DSCORE
AIMR Rating
.497 .003 30
.032 .447 20
One-tailed tests.
(consistent) results. One way to evaluate this reliability is through statistical comparisons of the inter-rater consistency. To this end, ten reports, each from a different company, were coded according to the Index by two different raters. Both raters were graduate students in accounting, and both received training in the use of the Index. Training consisted of a discussion of the Index items and examples of each (most of which are included in Appendix B) and directions to canvass the other reports available to identify potentially inapplicable (NA) items. Cohen’s Kappa (Cohen, 1960) is a conservative measure of inter-rater agreement and measures the proportion of agreement between raters on individual items after taking into consideration the probability of agreement by random chance. The Cohen’s Kappa (Cohen, 1960) value for the agreement between these two coders is .802, significant at po.000, indicating that the raters agreed approximately 80% of the time, excluding agreement by random chance.13
196
LORI HOLDER-WEBB
CONCLUSION This paper provides a tool that can be used to translate qualitative information provided in the MD&A into a quantitative measure that can be used in a variety of experimental and other research contexts, including as a dependent or independent variable in regression analysis. The ability to do so will benefit researchers who are interested in exploring the effects of changing SEC regulation or the Enhanced Business Reporting initiative on this issue, as well as the investor processing and overload issues suggested by Paredes (2003). The tool permits evaluation of any publicly traded company, unlike the size-biased and limited analysts ratings provided through the AIMR reports. It is also insensitive to variance in the firm’s choice of reporting format, unlike the Botosan Index (Botosan, 1997), and thus permits evaluation of disclosure behavior across time when reporting formats vary, or between firms that use different reporting formats. For example, Cohen, Gaynor, Krishnamoorthy, and Wright (2007) in a study on auditor and audit committee communications identify a number of potential studies involving MD&A quality. They suggest, among other issues, that it may be desirable to ascribe additional responsibility for oversight of the MD&A to the audit committee and the external auditors. The MD&A Index would be useful in constructing sample MD&As of rigorous but varying quality in order to examine in an experimental context whether differences in MD&A quality affect the type and/or extent of the communications between audit committees and the auditors. Another opportunity for behavioral researchers would be to evaluate whether the quality of the constructed MD&As affects auditor’s overall risk assessment for the firm. The tool could also be used in an experiment to assess the degree to which individual characteristics (such as experience, training, education, etc.) affect a person’s perception of the quality of an MD&A. Actual published MD&As could be scored with the Index to determine the intrinsic quality of the disclosures, and then disseminated (minus the scores) to experimental participants who are asked to rate the quality of the various disclosures. An interesting take on such an experiment could involve the deliberate injection of extraneous verbiage or redundant information in an effort to determine whether participants are sensitive to the length rather than the quality of disclosures. The new disclosure environment demanded by the Sarbanes-Oxley Act and other government regulatory-disclosure reforms has the potential to incite investor information overload (Paredes, 2003). This concern is evidently shared by the EBRC, which states that ‘‘the usefulness of a disclosure should
Providing a Tool for Evaluating Managements’ Discussion and Analysis
197
be judged in the context of all existing disclosures, as unnecessary disclosure can be damaging or misleading if relevant information is obscured’’ (EBRC, 2005). Paredes (2003) specifically calls for empirical research on actual investor information demands and processing abilities in response to these concerns. The ability to evaluate cognitive overload is contingent upon a more in-depth understanding of the information that is provided by the corporation than is available through much of the extant research. This developing field of study should benefit from the ability to use the MD&A Index to translate qualitative elements of information into quantitative measures that can be used in large-scale studies. While the primary audience for this paper is academic, the tool may also be useful for business or financial analysts. The MD&A is to provide, at the direction of the SEC, managers’ insight into the current operations of the company. The tool yields a quantitative score that reflects the percentage of disclosure opportunities (with respect to the MD&A) that the managers of the firm chose to exercise and thus permits objective comparison between firms. The tool may be useful for investors and analysts who wish to track the quality of a given firm’s disclosures from year to year, or to develop a benchmark for MD&A quality for a given industry and evaluate firms against that. The Sarbanes Oxley Act is predicated on the assumption that more information and more disclosure is better; Paredes (2003) articulates concerns about the degree to which this assumption is true. The regulatory and legislative action has thrust the question of transparency and MD&A disclosure on to center stage for researchers and practitioners. Only by expanding our evaluation of disclosure and information into the firmspecific environment will it be possible to answer these questions.
NOTES 1. The Enhanced Business Reporting Consortium is comprised of the AICPA, the Business Roundtable, Business Wire, National Association of Corporate Directors, the NASDAQ, the National Investor Relations Institute, the Open Compliance and Ethics Group, PR Newswire, and XBRL International. 2. Rich and Solomon (1992) note that in order for the predominantly computerized type of content analysis to effectively evaluate disclosure quality, the evaluated disclosure must possess either a formulaic approach or lend itself to the construction of a determinant vocabulary. 3. For example, please see the Results of Operations section of the Index (in Appendix A), where the primary recording units (i.e., discussion of extraordinary items, revenues, economic climate, etc.) are broken out into discussions of current effects and discussions of future effects. For some items that were heavily
198
LORI HOLDER-WEBB
emphasized by the SEC (e.g., revenue projections), this approach is not practicable and the single item can receive a maximum of 2 points. 4. If the repetition of requests has been driven not by perceived relative importance but by perceived shortcomings in extant practice, this weighting process may introduce noise into the DQUAL measure. 5. The time to complete the instrument is highly variable, and depends on familiarity with the instrument and with MD&A disclosures, reading speed, and the length of a given report. Some (but not all) variance in the length of the report can be explained by the size and/or complexity of the business, and the presence of unusual events during the fiscal year. Firms with a high level of complexity therefore are likely to require more time. Due to the great variation in report length and reading speed, generating meaningful estimates of the length of time that will be required to score a report is impossible. However, the experience of the author and of the other coders who participated in the validation of the instrument is that the length of time required for coding decreases significantly with experience. 6. Points were not awarded by segment, but by the entire MD&A. In general, MD&As are too lengthy to include in this paper; therefore, Appendix B provides a brief snapshot of the process. 7. This sample period is limited by availability of AIMR rating reports. These reports are not available after 1994. 8. This result is consistent with that of Botosan (1997). 9. Botosan (1997) was unable to report results for AUDITOR due to a lack of variability in her sample. Results for DE_RAT are dissimilar from those of Botosan (1997), who reported a significant positive association that does not exist in this validation sample. However, subsample analysis correlating Botosan’s score with the DE_RAT of this sample also does not yield a significant positive correlation; i.e., similar results are observed with respect to leverage with Botosan’s DSCORE and the MD&A DQUAL. Botosan’s (1997) sample is constrained by a number of factors that have not been imposed upon the validation sample; in particular, her sample firms are all very large and have substantial analyst following. These findings suggest, therefore, that the previously observed relationship between her disclosure quality measure and leverage may be limited to large firms with very rich information environments. 10. In addition to this step, Botosan (1997) deploys the measure on a sample of firms in order to evaluate the relationship between disclosure levels and cost of equity capital. This final step in the process ensures that the measure is sensitive to interobservational differences. The MD&A Index has also been deployed in this manner, in a study evaluating changes in disclosure level over time for a group of financiallydistressed firms (Holder-Webb & Cohen, 2007). In that study, the Index demonstrated a sensitivity not only to inter-firm disclosure practices, but also to changes in disclosure practice over time. 11. The moderately strong correlation approaches 50% and is higher than that reported by Botosan (1997) in the validation of her instrument. The magnitude of the correlation coefficient is theoretically limited, as Botosan’s measure takes into consideration the entire Annual Report, while the DQUAL measure evaluates a subset of that information. An extremely strong correlation would suggest that the
Providing a Tool for Evaluating Managements’ Discussion and Analysis
199
remainder of the Annual Report conveys little or no incremental content or quality beyond that contained in the MD&A, an unlikely proposition. 12. This may possibly be a function of the small sample size (due to limitations of data availability for the AIMR reports in this time period). However, Botosan (1997) found that the significant correlations between her measure and the AIMR reports pertained only to sections of her measure, rather than the entire metric; she also found that even the non-parametric correlations appeared to be subject to the influence of extreme observations. 13. Landis and Koch (1977) suggest that a Kappa value of .80 indicates ‘‘Substantial’’ agreement, a category just short of ‘‘Almost Perfect.’’
REFERENCES Ahmed, K. (1995). The effect of corporate characteristics on disclosure quality in corporate annual reports: A meta-analysis. Working paper. Victoria University of Wellington, New Zealand. Botosan, C. A. (1997). Disclosure level and the cost of equity capital. The Accounting Review, 72(July), 323–349. Botosan, C. A., & Plumlee, M. A. (2002). A re-examination of disclosure level and expected cost of equity capital. Journal of Accounting Research, 40(1), 21–41. Bratton, W. W. (2003). Enron, Sarbanes-Oxley and accounting: Rules versus principles versus rents. Villanova Law Review, 48(4), 1023. Bryan, S. H. (1997). Incremental information content of required disclosures contained in Management Discussion and Analysis. The Accounting Review, 72(April), 285–301. Clarkson, P. M., Guedes, J., & Thompson, R. (1999). Evidence that Management Discussion and Analysis (MD&A) is a part of a firm’s overall disclosure package. Contemporary Accounting Research, 16(Spring), 111–134. Cohen, D. A., Dey, A., & Lys, T. (2005). Trends in earnings management and informativeness of earnings announcements in the pre- and post-Sarbanes Oxley periods. Working paper. Northwestern University, Evanston, Illinois. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. Cohen, J., Gaynor, L. M., Krishnamoorthy, G., & Wright, A. (2007). Auditor communications with the audit committee: Policy recommendations and opportunities for further research. Forthcoming in Accounting Horizons. Diamond, D. W., & Verrecchia, R. E. (1991). Disclosure, liquidity, and the cost of capital. The Journal of Finance, 46(4), 1325–1360. Enhanced Business Reporting Consortium. (2005). The Enhanced Business Reporting Framework, October 2005 draft, available at www.ebr360.org (January 6, 2006). Frazier, K. B., Ingram, R. W., & Tennyson, B. M. (1984). A methodology for the analysis of narrative accounting disclosures. Journal of Accounting Research, 22(1), 318–331. Healy, P. M., Hutton, A. P., & Palepu, K. G. (1999). Stock performance and intermediation changes surrounding sustained increases in disclosure. Contemporary Accounting Research, 16(Fall), 485–520.
200
LORI HOLDER-WEBB
Heyman, J. A. (1989). SEC interprets Managements’ Discussion and Analysis. The CPA Journal, 59(December), 70. Holder-Webb, L. M., & Cohen, J. R. (2007). The association between disclosure, distress, and failure. Journal of Business Ethics, (forthcoming). Hooks, K. L., & Moon, J. E. (1991). Applications in accounting: A checklist for Management Discussion and Analysis. Journal of Accountancy, 172(6), 95–101. Hooks, K. L., & Moon, J. E. (1993). A classification scheme to examine Management Discussion and Analysis compliance. Accounting Horizons, 7(June), 41–59. Kim, O., & Verrecchia, R. E. (1991). Trading volume and price reactions to public announcements. Journal of Accounting Research, 29(2), 302–322. Kim, O., & Verrecchia, R. E. (2001). The relation among disclosure, returns, and trading volume information. The Accounting Review, 76(4), 633–655. Land, M., & Lundholm, R. (1996). Corporate disclosure quality and analyst behavior. The Accounting Review, 71(October), 467–492. Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174. Lang, M., & Lundholm, R. (1993). Cross-sectional determinants of analyst ratings of corporate disclosures. Journal of Accounting Research, 31(Autumn), 246–271. Lang, M., & Lundholm, R. (2001). Voluntary disclosure and equity offerings: Reducing information asymmetry or hyping the stock? Contemporary Accounting Research, 17(Winter), 623–663. Marston, C. L., & Shrives, P. J. (1991). The use of disclosure indices in accounting research: A review article. British Accounting Review, 23, 195–210. Paredes, T. A. (2003). Blinded by the light: Information overload and its consequences for securities regulation. Washington University Law Quarterly, 81(Summer), 417–486. Rich, J., & Solomon, I. (1992). Discussion of ‘‘Whose power prevails in disclosure practices?’’. Auditing: A Journal of Practice and Theory, 11(Supplement), 106–111. Sarbanes-Oxley Act. (2002). Pub. L. No. 107-204, 116 Stat. 745. Securities Exchange Commission. (1987). Concept Release on Management’s Discussion and Analysis of Financial Condition and Results of Operations. 52 FR 13715. U.S. Securities and Exchange Commission, New York. Securities Exchange Commission. (1989). Financial Reporting Release No. 36. Management’s Discussion and Analysis of Financial Condition and Results of Operations; Certain Investment Company Disclosures. May. U.S. Securities and Exchange Commission, New York. Securities Exchange Commission. (2001). Petition to U.S. Securities Exchange Commission for Issuance of Interpretive Release. Filed by Arthur Andersen LLP, Deloitte and Touche LLP, Ernst & Young LLP, KPMG LLP, PricewaterhouseCoopers LLP, and American Institute of Certified Public Accountants. Located at http://www.sec.gov/rules/petitions/ petndiscl-12312001.htm (as of July 1, 2005). U.S. Securities and Exchange Commission, New York. Securities Exchange Commission. (2003). Interpretation: Commission Guidance Regarding Management’s Discussion and Analysis of Financial Condition and Results of Operations. 17 CFR Parts 211, 231, and 241. Release Nos. 33-8350, 34-48960, and FR-72. U.S. Securities and Exchange Commission, New York.
Providing a Tool for Evaluating Managements’ Discussion and Analysis
201
Trochim, W. (2000). The research methods knowledge base (2nd ed.). Cincinnati, OH: Atomic Dog Publishing. Verrecchia, R. E. (2001). Essays on disclosure. Journal of Accounting and Economics, 32(1–3), 97–180.
APPENDIX A. MD&A SCORING INDEX Company Report Year End General
Score Basic
General description of business provided Major products, segments, or subsidiaries (description; relative contribution to sales or results) Specific details about products, including manufacture, shipping, inventory turns, etc. (qualitative; quantitative) Segments and subdivisions (impact on operations; breakdown of impact on operations by segment) NA Customers or markets (description; percentage provided to each customer segment or market) Description of facilities (list primary manufacturing, retail, service centers; discuss capacity) Number of employees Order backlog NA Recent acquisitions, impact on current operations NA
Enhanced
1 1
1
1
1
1
1
1
1
1
1
1 1 1
Results of Operations Loss provisions, explained in terms of an event NA Extraordinary gains or losses, current impact (qualitative; quantitative) NA Extraordinary gains or losses, future impact (qualitative; quantitative) NA Current unusual or infrequent items, not extraordinary (qualitative, with details; quantitative impact on operations) NA Anticipated unusual or infrequent items, not extraordinary (qualitative, with details; quantitative impact on future operations) NA Discontinued operations, impact on current results (qualitative; quantitative) NA Discontinued operations, impact on future results (qualitative; quantitative) NA Economic risks and climate, known (discuss risks in economic climate of firm; quantify impact on current results) Economic risks and climate, anticipated (discussion) Revenues, current trends (explain by volume/price changes; analyze factors) Revenues, anticipated trends (projected trends or discussion of uncertainties that prevent projection) Gross margin/gross profit, current trends (identify and explain trends; discuss as percentage of sales) Gross margin/gross profit, anticipated trends (projected trends or discussion of uncertainties that prevent projection) Cost of sales, cost of goods sold, or operating income, current (discuss; analyze factors) Cost of sales, cost of goods sold, or operating income, anticipated (discuss trend; analyze factors) Inflation, changing prices, and currency translations (quantify impact on current operating results)
Basic
Enhanced
1 1 1
1 1
1
1
1
1
1 1
1 1
1
1
1 1
1
2 1
1
1 1
1
1
1
1
202
LORI HOLDER-WEBB
Capital Resources
Basic Current commitments (list current expenditures or explicitly state that there are none) Sources of funds for current commitments (individually identify sources; provide details about sources such as interest rates, limits, etc) NA Sources of funds for current commitments, trends (known; anticipated) NA Proposed commitments (list proposed; provide schedule or projection) Sources of funds for proposed commitments (identify, including currently unused sources and negotiations underway; provide details as above) NA Time frame (time frame for proposed expenditures or hiatus in spending) Changes in capital resources, items (quantitative description of new or renegotiated lines, stock issues, etc; analyze effect that new or changed source has on current and future expenditures) NA Changes in capital resources, costs (qualitative discussion; discuss impact of costs on current and future operations) NA Debt covenants (details; potential effects) NA Economic climate (effect on ability to finance costs of capital) Business risks, including economic and financial (list significant) Risk management (discuss risk management method; quantitative details; limitations on effectiveness)
Enhanced
1 1
1
1 1
1 1
1
1
1 1
2
1
2
1 1 1
1
1
1
1
Liquidity
Basic
Enhanced
Working capital, known trends (qualitative, include cause; quantitative) 1 1 Working capital, anticipated trends (qualitative, include cause; quantitative) 1 1 Current demands and obligations (list) 1 Sources of funds to meet current demands and obligations (discuss internal and external 1 1 sources; identify disposition of each in current period) Changes in the supply of funds, current and projected effects (quantify current; discuss 1 1 projected) Liquidity effects of balance sheet conditions (current impact; projected trends) 1 1 Liquidity effects of operations (current impact; projected trends) 1 1 Liquidity effects of investing and financing items (current impact; projected trends) 1 1 Liquidity deficiencies (source of deficiency; action taken or planned, or reason why defi1 1 ciency can’t be remedied) NA NA indicates that one of more of the points in this category may not be available and is deducted from the # of total points available. The Disclosure Quality score is computed through the following algorithm: Maximum available points = 82 minus the sum of Non-Applicable items Discussed points = Sum of all points awarded (both Basic and Enhanced) for appropriately discussed items Disclosure quality = Discussed points divided by the maximum available points
Scoring Sheet Guidelines Peruse the Annual Report or 10-K to check for the existence of multiple lines of operations, material acquisitions, unusual items, discontinued operations, capital expenditures and/or material commitments, changes in capital resources (issues or repurchases of stock or bonds, new or retired lines of credit), debt covenants, and liquidity problems. Mark ‘‘NA’’ on the
Providing a Tool for Evaluating Managements’ Discussion and Analysis
203
scoring sheet for conditions, elements, or items that do not exist or are not relevant in the given year, and subtract the total points available for each non-relevant item from the base maximum of 82 points. General (14 Points) Description of Business. Award 1 point if a general description of the company’s business activities is provided. This will not generally appear as a separate section, although some companies delineate the primary sources of their revenues. If the nature of the business is clearly presented in the MD&A text, credit should be awarded. Description of Major Products, Segments, or Subsidiaries. Award 1 point if the major products, segments, or subsidiaries are described. Award 1 additional point for a discussion of the relative contribution of the products, segments, or subsidiaries in the production or sales mix. Often, statements about relative contribution are quantitative (i.e., ‘‘50% of revenues were provided by mainframe sales. An additional 30% of revenues arose from sales of desktop units.’’). In some circumstances an exact quantitative measure is unnecessary for providing information about the relative contribution of the line (for example, ‘‘Substantially all of the revenues for 1994 came from sales of mainframe systems’’ would also receive a point for ‘‘quantitative’’ even though an exact number is not provided). ‘‘Segments’’ for the purpose of this Index are not necessarily the same as the formal segments under the FAS, nor is it the same as the 2- or 3-digit SIC codes in which the company operates. A firm may operate in one formal segment (i.e., computer manufacturing), and yet have several major and discrete product lines (i.e., mainframes, desktops, personal digital assistants). Integrators may manufacture hardware and write software, and provide consulting – this constitutes three distinct lines of business for the purpose of this instrument. Judgment must be employed in identifying these categories; it can be helpful to examine the ‘‘Description of the Business’’ section of the 10-K in order to determine what the major lines of business or product lines are. Specific Details about Products. Award 1 point for a qualitative discussion of manufacturing, shipping, service levels, inventory turns, or other details designated as useful by management for the industry in which the firm
204
LORI HOLDER-WEBB
operates. Award 1 additional point for a quantitative description. Example: A disclosure that is limited to ‘‘Sales of Product X were insignificant in 1992, but are expected to rise as the new factory reaches full production capacity’’ would receive only 1 point, as ‘‘insignificant’’ and ‘‘expected to rise’’ are vague and do not yield information that is useful for future benchmarking. However, ‘‘The Company shipped a total of 1.74, 1.85, and 1.85 net tons of steel bar in 1991, 1990, and 1989 respectively. Over the same period, shipments of rolled sheet steel were 4.2, 4.7, and 4.9 net tons’’ would receive 2 points for the level of detail. It is not necessary that these details be provided for every major product line or operating segment. If the company does it for only one, it will generally be the primary line. Segments and Subdivisions. Award 1 point for discussing the basic impact of different business segments or subdivisions on the results of operation. Award 1 additional point if quantitative sales and operating income or percentage contributions to overall Company income are provided for each major segment or subdivision. For example: ‘‘The Company’s losses for each of the last two years are attributable to its voice processing operations. The Company’s performance analysis segment had net operating income for each of the last three years as did the voice processing segment for fiscal 1989’’ is a basic discussion, stating that the segments had operating income but not what that income was or what percentage of the Company’s overall income the segments contributed. This disclosure would receive a total score of 1. If the disclosure also included the dollar amount of income or loss from each segment, it would receive a total score of 2. These are segments identified as under no. 2, above. If the company has only one segment and no subdivisions, mark this item as NA and subtract 2 points from the maximum available. Description of Customers or Markets. Award 1 point if the primary markets or customers are listed or described. Award 1 additional point if quantitative information is provided about the percentage of business provided to different sectors of the customer base. For example, ‘‘Most of the company’s sales take place in European markets,’’ would receive 1 point, whereas ‘‘The Company provides contract manufacturing services to original equipment manufacturers in the electronics industry, including producers of industrial instruments, computers, communications equipment, and medical devices. In 1993, over 60% of our orders arose from medical device manufacturers’’ would receive a total of 2 points.
Providing a Tool for Evaluating Managements’ Discussion and Analysis
205
Description of Facilities. Award 1 point for discussing manufacturing, retailing, or service center facilities. Award 1 additional point if the capacity of these facilities or discussions of capacity-limiting events or factors is provided. For example, ‘‘The Company owns 53 hamburger restaurants, concentrated in the states of Florida, Georgia, South Carolina, and Alabama’’ would receive a total of 1 point (as this gives no information about what the customer capacity of these units is). However, even in the absence of specific unit capacities, following this statement with the information that ‘‘the summer of 2005 featured numerous major weather events in this region that resulted in the complete destruction of 10 restaurants and temporary closures of another 22 restaurants during the year’’ would result in a total score of 2. It is only necessary to discuss the primary facilities to award credit for this item. Number of Employees. employees.
Award 1 point for providing the number of
Order Backlog. Award 1 point for discussing the order backlog, for stating that there is no backlog, or for mentioning that order backlog is not a significant element for the company. If the company operates in an industry where order backlog is not relevant, subtract 1 point from the maximum available as an NA item. Recent Acquisitions. Award 1 point for analysis of the new acquisition’s effect on current operating results (including material acquisitions in the current and prior period). If there are no recent major acquisitions, subtract 1 point from the maximum available as an NA item. Results of Operations (28 Points) Loss Provisions and Write-Downs. Award 1 point if loss provisions or write-downs are explained in terms of an event. This includes environmental provisions, litigations provisions, and significant inventory write-downs, but does not include provisions/write-downs related to restructuring actions (these are included instead under unusual items, below). If no loss provisions are made, subtract 1 point from maximum available as an NA item. If loss provisions are made but not explained in terms of an event, award zero points. Extraordinary Gains and Losses (Current Impact). Award 1 point for qualitative discussion of the impact of extraordinary items upon current
206
LORI HOLDER-WEBB
operating results. Award 1 additional point if this impact is quantified. If the firm had no extraordinary items during the period, subtract 2 points from the maximum available as an NA item. Extraordinary Gains and Losses (Future Impact). Award 1 point for qualitative discussion of the impact of current or expected extraordinary items upon future results. Award 1 additional point if this impact is quantified. This includes anticipated benefits of NOL carry forwards in future periods. If the firm had no extraordinary items during the period and does not mention the expectation of extraordinary items arising in the next period, subtract 2 points from the maximum available as an NA item. Current Unusual or Infrequent Items, not Extraordinary. Award 1 point for qualitative discussion of the current impact of non-recurring items, including restructuring charges. Award 1 additional point for a quantitative discussion of the impact of these items on current operating results. Discussions of restructuring charges must include information about the extent of the restructuring program and the elements (asset sales, layoffs, etc). If the firm had no non-recurring or unusual items during the period, subtract 2 points from the maximum available as an NA item. If the firm restructured but mentions only the aggregate value of the charge, award 1 point for basic information only. Anticipated Unusual or Infrequent Items, not Extraordinary. Award 1 point for qualitative discussion of the impact of current or future non-recurring items. Award 1 additional point for a quantitative discussion of the expected impact of these items on future operating results. This includes the projected benefit of current restructuring charges, the anticipated but non-accrued costs of litigation, the expected impact of adopting new accounting pronouncements, etc. If the item refers to a future adoption of an accounting pronouncement, it must include details (such as the specific time of adoption and expected financial impact of the adoption) in order to receive basic credit. In order to receive credit for an enhanced disclosure, a projected dollar value for the charge is required. If the firm has no current unusual items and does not make reference to unusual items expected to arise during the next period, subtract 2 points from the maximum available as an NA item. Discontinued Operations (Current Trends). Award 1 point for qualitative discussion of the impact of discontinued operations on current operating
Providing a Tool for Evaluating Managements’ Discussion and Analysis
207
results. Award 1 additional point for quantifying this impact. The impact includes known trends, events, uncertainties, and contingencies that may materially affect the company’s liquidity, financial condition, and results of operations between the measurement date and the date when risks of those operations will be transferred or otherwise terminated. If there are no discontinued operations noted on the financial statements, subtract 2 points from the maximum available as an NA item. Discontinued Operations (Anticipated Trends). Award 1 point for qualitative discussion of the projected impact of discontinued operations on future operating results. Award 1 additional point for quantifying this impact. The impact includes known trends, events, uncertainties, and contingencies that may materially affect the company’s liquidity, financial condition, and results of operations between the measurement date and the date when risks of those operations will be transferred or otherwise terminated. If there are no discontinued operations noted on the financial statements, subtract 2 points from the maximum available as an NA item. Economic Risks and Climate (Known). Award 1 point for discussing the impact of macroeconomic events or the competitive environment on the firm’s current operations. Award 1 additional point for quantifying the impact of the events or environment on current operating results. This encompasses a wide variety of influences, including domestic and foreign recessions, the current regulatory climate where relevant, and for energy providers, the effect of climactic changes and events upon the demand for their products. Economic Risks and Climate (Anticipated). Award 1 point for discussing the potential impact of current or future economic events or trends, or the competitive environment on future operating results. The following items pertaining to revenues, gross margin/gross profit, and cost of sales must be addressed for each operating segment or subsidiary of the firm. If only part of the segments or subsidiaries are covered, assign points based on the percentage of total segments or subsidiaries covered (i.e., if 2 of a total 5 segments or subsidiaries are covered, assign no more than 2/5 the available points in any category. If an n-division enterprise is discussed in the aggregate only, award 1/n points.), operating divisions are determined by examining the president’s letter and the description of the business from the 10 k or the footnotes, not through the footnote ‘‘‘segment’ disclosures’’.
208
LORI HOLDER-WEBB
Revenues (Current Trends). Award 1 point for a basic discussion explaining revenue changes by volume and/or price changes. Award 1 additional point for an enhanced discussion providing an analysis of factors contributing to the volume and/or price changes. For example, ‘‘Sales increased by 12% due to an change in shipping volume for the company’s products’’ warrant 1 point only, while ‘‘Sales rose 4% in 1992 to $1.72 billion as tonnage increased 7%, double the growth rate experienced by the service center industry as a whole. Average selling prices, however, declined by 3% as weakened market conditions prevailed on the East and West Coasts.’’ Warrant 2 points. Revenues (Anticipated Trends). Award 2 points for discussion of anticipated changes, trends, or uncertainties preventing prediction of changes and trends. It is not necessary to provide projections for all operating lines. Gross Margins/Gross Profit (Current Trends). Award 1 point if trends in gross margin or gross profit are identified and explained. Award 1 additional point for discussing changes in gross margin/profit as a percentage of sales. This may be discussed in terms of costs-of-sales, but if so, do not award credit for both this item and the Cost of Sales items below. Gross Margins/Gross Profit (Anticipated Trends). Award 1 point if anticipated trends in gross margin or gross profit are discussed. Costs of Sales, Cost of Goods Sold, Operating Expenses, or Operating Income (Current). Award 1 point if current trends in costs of sales, costs of goods sold, or operating income are discussed. Award 1 additional point for analysis of the factors responsible for the change. Costs of Sales, Cost of Goods Sold, Operating Expenses, or Operating Income (Anticipated). Award 1 point if anticipated trends in costs of sales, costs of goods sold, or operating income are discussed. Award 1 additional point for analysis of the factors responsible for the anticipated change. Inflation, Changing Prices, or Currency Translations. Award 1 point for discussing the impact of inflation, changing prices, or foreign currency translations on current operating results, or stating that these factors have no effect on current results.
Providing a Tool for Evaluating Managements’ Discussion and Analysis
209
Capital Resources Current Commitments. Award 1 point for a list of currently committed capital expenditures or explicit mention that there are no current commitments for capital expenditures. Sources of Funds for Current Commitments. Award 1 point if the sources of funds for the commitments are individually identified. Award 1 additional point if they are done in detail (stating the extent and provisions of open lines of credit, interest rates on debt used to finance the expenditures, etc.). For example, 1 point would be awarded for ‘‘capital expenditures are financed through a combination of cash from operations, a financing arrangement with Foothill Capital, and the proceeds from the issuance of stock’’ while 2 points would be awarded for ‘‘capital expenditures are financed through a combination of cash from operations, the financing arrangement with Foothill (described below), and the proceeds from the issuance of stock y the Foothill Note bears an interest rate of 5.6% and a limit of $50,000,000.’’ If the company explicitly states that there were no capital expenditures and/or are no current commitments for capital expenditures, subtract 2 points from the maximum available as an NA item. Sources of Funds for Current Commitments (Trends). Award 1 point for a discussion of known trends in the sources of funds for current commitments. Award 1 additional point for a discussion of anticipated trends in these sources. For example: ‘‘Capital expenditures are limited by the presence of restrictive agreements in the Company’s financing agreements. In 1994, expenditures were limited to $1.2 million. Unless the Company obtains additional financing, 1995 expenditures will be limited to $750,000’’ would receive 1 point as the discussion does not include information about the likelihood or obtaining additional financing, or the expectations of additional financing. However, ‘‘The Company has filed a shelf registration for 25,000 shares with the SEC, with the intent of retiring or refinancing the current line of credit that is used to fund capital expenditures.’’ would receive 2 points. If the company explicitly states that there are no current commitments for capital expenditures, subtract 2 points from the maximum available as an NA item. Proposed Commitments. Award 1 point for a list of proposed capital expenditures. Award 1 additional point if a dollar-value schedule or projection of capital expenditures is provided. An aggregate dollar value of expenditures is not sufficient; there must be an indication of what activities
210
LORI HOLDER-WEBB
the project expenditure comprises. For example, ‘‘Capital expenditures in 1993 will include maintenance on the company’s plant facilities, and the acquisition of new product lines.’’ receives 1 point while ‘‘Capital expenditures for 1996 are expected to be approximately $1.3 million, and will include the construction of new storage tanks and a petroleum cracking unit.’’ If the company explicitly states that no capital expenditures are planned, subtract 2 points from the maximum available as an NA item. Sources of Funds for Proposed Commitments. Award 1 point if the sources of funds (including plans to obtain funding or a description of currently unused sources) for the proposed commitments are identified. Award 1 additional point if they are done so in detail (stating the extent and provisions of open lines of credit, interest rates on debt used to finance the expenditures, etc.). If the company explicitly states that no capital expenditures are planned, subtract 2 points from the maximum available as an NA item. Time Frame. Award 1 point if a time frame for proposed capital expenditures or the hiatus in spending (for firms that state explicitly that no expenditures are planned) is provided. If the company indicates that there are no planned expenditures, but fails to provide a horizon over which the lack of investment is expected to continue, do not award credit. Changes in Capital Resources (Items). Award 1 point for a discussion of new lines of credit, renegotiated lines of credit, or new or retired equity or debt issues. Award 2 additional points for an analysis of the effect that the change in availability of funds has on current and future operations. For example, ‘‘The Company used the entire proceeds of $3,000,000 from the private placement of 800,000 common shares to retire the outstanding debt under the Foothill Credit Facility’’ would receive a total of 3 points (1 for the information about the number of shares issued and the price per share; an extra 2 for the detailed information about how the proceeds were used). If examination of the financials indicates that there were no changes in equity, public debt, or private credit, subtract 3 points from the maximum available as an NA item. Changes in Capital Resources (Costs). Award 1 point for a qualitative discussion of the costs of new or renegotiated resources. Award 2 additional points if the impact of these costs on current and future results of operations is discussed. For example, ‘‘The Company replaced the 9-year 8% Foothill
Providing a Tool for Evaluating Managements’ Discussion and Analysis
211
Note with a new revolving line of credit from First National Bank bearing 8.25% interest’’ would receive 1 point, while adding, ‘‘This exchange is expected to save the company an estimated $350,000 per year in interest charges’’ or ‘‘The new credit line bears fewer restrictive covenants and should therefore provide a greater degree of flexibility in the Company’s operating environment’’ would earn the extra points for a total of 3. If there were no changes in equity, public debt, or private credit, subtract 3 points from the maximum available as an NA item. Debt Covenants. Award 1 point for providing details on debt covenants, whether or not the company is in default. Award 1 additional point for discussing the potential effect of restrictive covenants on the future investing, financing, or operating environment. If no debt covenants are mentioned in the long-term debt footnote, subtract 2 points from the maximum available as an NA item. Economic Climate. Award 1 point for discussing the effect that changes in the economic climate (i.e., macroeconomic shocks, interest rates, or regulation) would have on the ability of the firm to finance its costs of capital. Business Risks (Including Economic and Financial). Award 1 point for listing the financial and operating risks to which the company is exposed, including risks from changing economic conditions (commodity prices), financial risks, or industry/operating risks. Risk Management. Award 1 point for discussing how the company’s business and economic risks are managed (i.e., hedging, derivative, insurance). Award 1 additional point for quantitative information about risk management methods. Award a third additional point for discussing the limitations on the risk management methods employed. If the company states that no risk management method is in place, award a total of 1 point. Award an additional 2 points if the impact of not having a risk management method in place is discussed. Liquidity Working Capital (Known Trends). Award 1 point for qualitative discussion of known trends in working capital (not cash); this discussion must include causes for trends. Award 1 additional point for quantitative discussion of these trends. For example, ‘‘At March 31, 1996, the Company
212
LORI HOLDER-WEBB
had working capital of $6.2 million compared to $8.5 million at March 31, 1995. The decrease in working capital was principally due to significant operating losses and funding of the development of the family of products’’ would receive a total of 2 points. Working Capital (Anticipated Trends). Award 1 point for qualitative discussion of anticipated trends (must include causes). Award 1 additional point for quantitative discussion of these trends. If the company states explicitly that uncertainties prevent prediction of trends, award 2 points. Current Demands and Obligations. Award 1 point for a list of current demands and obligations. This is not necessarily the same as the commitments represented in the Capital Resources section, although some firms will class them together. Full credit for this item, but not for the current commitments (above) would be awarded for ‘‘The Company’s main liquidity requirements arise from capital expenditures and working capital requirements, including interest payments.’’ Sources of Funds to Meet Current Demands and Obligations. Award 1 point for discussing internal and external sources of liquid assets. Award 1 additional point for identifying the disposition of the liquid assets from all sources. For example, ‘‘The Company has met these requirements over the last three years principally from the incurrence of additional long-term indebtedness and cash provided by operations’’ would warrant a score of 1 while ‘‘The company expects that internally generated cash flows will be adequate to satisfy all current obligations. Excess internally generated funds, as well as the remaining proceeds from the issue of equity, have been used to reduce long term debt levels’’ would receive a total score of 2. Economy of language is possible: for example, ‘‘The $102.8 million of funds generated by operating activities in 1994, along with $57 million of net cash proceeds from the sale of GSN and a combined $6.2 million from other sources were used to curtail $89.3 million of long-term debt and to fund capital expenditures and dividends to stockholders of $56.9 million and $11.6 million, respectively’’ would receive 2 points for Sources of funds to meet current demands and obligations, 1 point for Current demands and obligations 1 point for Liquidity effects of operations (below) 1 point for
Providing a Tool for Evaluating Managements’ Discussion and Analysis
213
Liquidity effects of investing and financing items (below), and 1 point for Sources of funds for current commitments. Changes in the Supply of Funds (Current and Projected Effects). Award 1 point for quantifying the current liquidity effects of changes in the supply of funds (not capital resources, but operating resources such as revolving credit agreements and other short-term loans). Award 1 additional point for discussing the potential effects on future liquidity of current changes in the supply of funds. If the supply of funds has not changed according to the financial statements, subtract 2 from the maximum available as an NA item. Liquidity Effects of Balance Sheet Conditions. Award 1 point for discussion of the current liquidity effects of balance sheet conditions. Award 1 additional point for a discussion of the anticipated future liquidity effects of current balance sheet conditions. Liquidity Effects of Operations. Award 1 point for discussion of the current liquidity effects of income or operations conditions. Award 1 additional point for a discussion of the anticipated future liquidity effects of current income or operations conditions. Liquidity Effects of Investing and Financing Items. Award 1 point for discussion of the current liquidity effects of investing and financing items. Award 1 additional point for a discussion of the anticipated future liquidity effects of current investing and financing items. Liquidity Deficiencies. Award 1 point for identifying the source of current liquidity deficiencies (items giving rise to technical default, actual default, negative or very low working capital, unusually low current ratios, or goingconcern audit opinions, for example). Award 1 additional point for discussing the action taken to remedy the deficiency, the proposed actions to be taken to remedy the deficiency, or an explanation of why the deficiency cannot be remedied. If no liquidity deficiencies exist, subtract 2 from the maximum available as an NA item.
214
LORI HOLDER-WEBB
APPENDIX B. SCORED EXCERPTS OF HIGH AND LOW QUALITY MD&A DISCLOSURES Company 1 (Low Quality Disclosure, Complete Operating Results Section from the MD&A) Continuing Operations Sales decreased 4% from 1990 reflecting the general down turn in the United States economy. Gross profit margins decreased modestly from 34.5% to 34.1%. The decrease was primarily the result of a favorable physical inventory adjustment in 1990, which did not recur in 1991. The increase in selling and administrative expenses was due to three factors: (1) a $280,000 reduction in income from the Company’s pension plans, (2) a $100,000 reduction in interest income for notes from the sale of a division in 1988, and (3) an increase in selling expense for expanding the Company’s markets and setting up a third branch location during 1991. The interest expense charged to continuing operations increased $438,000 because the Company allocated all of its interest expense after September 30, 1991, to its continuing operations. Interest expense for prior periods has been allocated to the discontinued operations based on the assets employed in those operations. Another factor was the increase cost of borrowing which the Company incurred after the termination of its revolving credit agreement. Minority interest in net income of a subsidiary ended because the Company bought-out the former minority shareholder during 1990.
RESULTS OF OPERATIONS SECTION Loss provisions, explained in terms of an event Extraordinary gains or losses, current impact (qualitative; quantitative) Extraordinary gains or losses, anticipated impact (qualitative; quantitative) Current unusual or infrequent items, not extraordinary (qualitative, with details; quantitative impact on operations) Anticipated unusual or infrequent items, not extraordinary (qualitative, with details; quantitative impact on future operations) Discontinued operations, impact on current results (qualitative; quantitative) Discontinued operations, impact on future results (qualitative; quantitative) Economic risks and climate, known (discuss risks in economic climate of firm; quantify impact on current results) Economic risks and climate, anticipated (discussion)
Basic NA NA NA
Enhanced NA NA
0
0
0
0
0 1
0 1
0
0
0
Providing a Tool for Evaluating Managements’ Discussion and Analysis Revenues, current trends (explain by volume/price changes; analyze factors) Revenues, anticipated trends (projected trends or discussion of uncertainties that prevent projection) Gross margin/gross profit, current trends (identify and explain trends; discuss as percentage of sales) Gross margin/gross profit, anticipated trends (projected trends or discussion of uncertainties that prevent projection) Cost of sales, cost of goods sold, or operating income, current (discuss; analyze factors) Cost of sales, cost of goods sold, or operating income, anticipated (discuss trend; analyze factors) Inflation, changing prices, and currency translations (quantify impact on current operating results)
215 0
1
0 1
0
0 0
0
0
0
0
All items are worth 1 point, except for revenue projection, which is worth 2 points. The company was, at the time, engaged in a major ongoing restructuring operation with significant effects on current and future expenses and cash flows. No other provisions or extraordinary items were revealed in a perusal of the Annual Report. Points awarded: 4; points available: 23; DQUAL score for this section: .1739.
Company 2 (High Quality Disclosure, Excerpted Operating Results Section from MD&A) Results of Operations 1991 vs. 1990. The Company recognized a net loss for the year ended December 31, 1991 of $140.9 million, or .76 per share, compared with a net loss in 1990 of $25.8 million, or $1.71 per share. $24.9 million of the 1991 net loss was recognized in the first nine months of the year. The $116 million fourth quarter loss was attributable primarily to adjustments made in (i) the Company’s estimate of the cost to complete two U.S. Navy shipbuilding contracts and (ii) the carrying value of goodwill. Net sales for the year ended December 31, 1991, increased by $24.6 million, or 3.3%, to $776.7 million. Most of the Company’s net sales were generated by activity on shipbuilding contracts with the U.S. Navy, as work progressed under contracts to build (detailed list of current contracts and contract award dates, with revenues broken down by major product/contract category). Decreases in net sales aggregating $108.6 million were recorded on the contracts for the seven T-AOs, two T-AOs, five Jumbo AOs, five LSDs, and thirteen LCACs. Of those declines, $30 million was attributable to the LCAC contract, under which LCACs delivered in 1990 were not replaced by subsequent contract awards. In addition, the Company experienced a $79 million decline in net sales contributed by commercial projects, reflecting the completion in 1990 of the Vidalia floating hydroelectric plant and the substantial completion in 1991 of the New York City floating detention facility. The Company also experienced an $8.8 million decline in net sales of other miscellaneous projects.
216
LORI HOLDER-WEBB
Gross profit declined by $56.9 million in 1991, with the Company incurring a gross loss of $29.2 million as compared to a gross profit of $27.7 million in 1990. The substantial decline in gross profit primarily resulted from significant increases in the estimated costs to complete the contracts to build the seven T-AOs and the three LSD-CVs, which resulted in a charge against income of $52.5 million for those two contracts. Of the total charge, approximately $49.1 million was recorded as an increase in the cost of sales, with the remaining $3.4 million being recorded as a reversal of previously recorded net sales. The reduction in gross profit also reflects an approximate $10 million charge related to the floating detention facility and a $6.5 million net charge related to the final phases of contracts to build two T-AOs and five LSDs, which were completed in the fourth quarter of 1991 and the first quarter of 1992, respectively. Of these charges, $14.6 million was recorded as a reversal of previously recorded net sales, with the remainder being recorded as an increase in the cost of sales. In addition, the gross loss reported reflects certain non-recurring charges taken in the fourth quarter in the aggregate net amount of $5.7 million, representing writedowns of certain real and personal property, arbitration costs, and the reduction of a previously recorded receivable reflecting a proposed settlement of that receivable by the customer. The $74.7 million in charges taken during 1991 were partially offset by gross profits of $17.8 million attributable to other Company operations. (Highly detailed discussion of reasons behind cost overruns, management plan for addressing the sources of the overruns, and related revenue and expenditure projections related to the source of the overruns.) Selling, general and administrative expense (SG&A) increased by approximately $3 million, or 10%, in 1991, principally due to $2 million of expenses associated with the start-up of the Company’s GRP division, which constructs the MHC-51 Coastal Minehunters; increases in legal and professional fees of approximately $1.1 million, principally because of increased tax advisory expense and increased litigation activity on environmental and other matters; financing costs of approximately $500,000 incurred in 1991 principally related to the renegotiation of Avondale’s principal credit facility and the financing of the Coastal Minehunter facility; and $1.5 million associated with the expansion of the Company’s operations by virtue of the Crawford and Genco acquisitions. These increases in SG&A expenses were offset by decreases of $900,000 in salary expense and $1.1 million in other miscellaneous expenses. The Company recorded a $57.6 million charge to reduce the carrying value of its goodwill to its estimated value on December 31, 1991. The charge reflected management’s determination that the value of this intangible asset
Providing a Tool for Evaluating Managements’ Discussion and Analysis
217
had been impaired because of the substantial operating losses experienced by the Company in recent years as well as the reductions in and uncertainties related to future levels of defense spending as discussed below. The increase in interest expense of $1.2 million reflects increased borrowings by Avondale under its principal credit facilities. The decline in other income of $2.3 million was attributable to a reduction of Avondale’s overall level of investment and lower rates of return on Avondale’s invested funds as interest rates declined. The reasons for Avondale’s increased borrowings and lower levels of investment are discussed under the ‘‘Liquidity and Capital Resources’’. (Text omitted here from this excerpt includes a detailed discussion of the economics of the shipbuilding industry, the probable trend in government contract revenues overall and for the Company, including point estimates of anticipated and potential contracts. Also omitted is additional detailed discussion of the effects of pending legislation on the Company’s business, projections of worldwide industry trends, and discussion of factors with the potential to limit the ability of the Company to compete effectively in the current operating environment.) RESULTS OF OPERATIONS SECTION Loss provisions, explained in terms of an event Extraordinary gains or losses, current impact (qualitative; quantitative) Extraordinary gains or losses, anticipated impact (qualitative; quantitative) Current unusual or infrequent items, not extraordinary (qualitative, with details; quantitative impact on operations) Anticipated unusual or infrequent items, not extraordinary (qualitative, with details; quantitative impact on future operations) Discontinued operations, impact on current results (qualitative; quantitative) Discontinued operations, impact on future results (qualitative; quantitative) Economic risks and climate, known (discuss risks in economic climate of firm; quantify impact on current results) Economic risks and climate, anticipated (discussion) Revenues, current trends (explain by volume/price changes; analyze factors) Revenues, anticipated trends (projected trends or discussion of uncertainties that prevent projection) Gross margin/gross profit, current trends (identify and explain trends; discuss as percentage of sales) Gross margin/gross profit, anticipated trends (projected trends or discussion of uncertainties that prevent projection) Cost of sales, cost of goods sold, or operating income, current (discuss; analyze factors) Cost of sales, cost of goods sold, or operating income, anticipated (discuss trend; analyze factors) Inflation, changing prices, and currency translations (quantify impact on current operating results)
Basic 1 NA NA 1
Enhanced NA NA 1
1
0
NA NA
NA NA
1
1
1 1
1
2 1
1
0 1
1
1
1
0
All items are worth 1 point, except for revenue projection, which is worth 2 points. The company was, at the time, engaged in a major ongoing restructuring operation with significant effects on current and future expenses and cash flows. No other provisions or extraordinary items were revealed in a perusal of the Annual Report. Points awarded: 17; Points available: 20; DQUAL score for this section: .85.
This page intentionally left blank