ADVANCES IN ACCOUNTING BEHAVIORAL RESEARCH
VOLUME 12
EDITED BY
VICKY ARNOLD
Kenneth G. Dixon School of Accounting, University of Central Florida, USA and Department of Accounting and Business Information Systems, The University of Melbourne, Australia
ASSOCIATE EDITORS
B. DOUGLAS CLINTON Northern Illinois University, USA
ANNE LILLIS University of Melbourne, Australia
ROBIN ROBERTS University of Central Florida, USA
CHRIS WOLFE Texas A&M University, USA
SALLY WRIGHT University of Massachusetts Boston, USA
ADVANCES IN ACCOUNTING BEHAVIORAL RESEARCH
Series Editor: Vicky Arnold
Recent volumes:
Volumes 1–4: edited by James E. Hunton
Volumes 5–12: edited by Vicky Arnold
JAI Press is an imprint of Emerald Group Publishing Limited
Howard House, Wagon Lane, Bingley BD16 1WA, UK
First edition 2009
Copyright © 2009 Emerald Group Publishing Limited
Reprints and permission service
Contact:
[email protected] No part of this book may be reproduced, stored in a retrieval system, transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publisher or a licence permitting restricted copying issued in the UK by The Copyright Licensing Agency and in the USA by The Copyright Clearance Center. No responsibility is accepted for the accuracy of information contained in the text, illustrations or advertisements. The opinions expressed in these chapters are not necessarily those of the Editor or the publisher. British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library ISBN: 978-1-84855-738-3 ISSN: 1475-1488 (Series)
LIST OF CONTRIBUTORS
Vicky Arnold
Dixon School of Accounting, University of Central Florida, Orlando, FL, USA and University of Melbourne, Australia
Dennis Bline
Accounting Department, Bryant University, Smithfield, RI, USA
Siew H. Chan
Department of Accounting, Washington State University, Pullman, WA, USA
Natalie Tatiana Churyk
Department of Accountancy, Northern Illinois University, DeKalb, IL, USA
B. Douglas Clinton
Department of Accountancy, Northern Illinois University, DeKalb, IL, USA
Craig Emby
Faculty of Business Administration, Simon Fraser University, Burnaby, BC, Canada
Heather M. Hermanson
Department of Accounting, Kennesaw State University, Kennesaw, GA, USA
Mary C. Hill
Department of Accounting, Kennesaw State University, Kennesaw, GA, USA
Susan H. Ivancevich
University of North Carolina Wilmington, Wilmington, NC, USA
Frances A. Kennedy
Clemson University, School of Accountancy and Legal Studies, Clemson, SC, USA
James M. Kohlmeyer, III
East Carolina University, College of Business, Greenville, NC, USA
Chih-Chen Lee
Department of Accountancy, Northern Illinois University, DeKalb, IL, USA
Dana R. Lowe
Accounting Department, Bryant University, Smithfield, RI, USA
Wilda F. Meixner
Accounting Department, Texas State University, San Marcos, TX, USA
Hossein Nouri
Department of Accounting and Information Systems, The College of New Jersey, Ewing, NJ, USA
Robert J. Parker
College of Business Administration, University of New Orleans, New Orleans, LA, USA
Brad A. Schafer
School of Accountancy, Georgia State University, Atlanta, GA, USA
Jennifer K. Schafer
Department of Accounting, Kennesaw State University, Kennesaw, GA, USA
Joann Segovia
Winona State University, Winona, MN, USA
Steve G. Sutton
Dixon School of Accounting, University of Central Florida, Orlando, FL, USA and University of Melbourne, Australia
Lee J. Yao
The Butt College of Business, Loyola University New Orleans, New Orleans, LA, USA
REVIEWER ACKNOWLEDGMENTS
The Editor and Associate Editors at AABR thank the many excellent reviewers who volunteered their time and expertise to make this an outstanding publication. Publishing quality papers in a timely manner would not be possible without their efforts.
Mohammed Abdolmohammadi Bentley College, USA
Christine Haynes University of West Georgia, USA
John Anderson San Diego State University, USA
Karen L. Hooks Florida Atlantic University, USA
Jean C. Bedard Bentley University, USA
Kip Krumweide Boise State University, USA
Richard Brody University of New Mexico, USA
Jordan Lowe Arizona State University, USA
Erin Burrell University of Central Florida, USA
Nace Magner Western Kentucky University, USA
Charles Cho Concordia University, Canada
Ted Mock University of Southern California, USA
Mary Curtis University of North Texas, USA
Naman Desai University of Central Florida, USA
Carolyn Strand Norman Virginia Commonwealth University, USA
Carlin Dowling The University of Melbourne, Australia
William Pasewark Texas Tech University, USA
Michael Favere-Marchesi Simon Fraser University, Canada
Amy Hageman University of Central Florida, USA
Jillian Phillips University of San Diego, USA
Katrina Schafie University of Central Florida, USA
Kristen Wentzel Temple University, USA
Steve Sutton University of Central Florida, USA
Tim West University of Arkansas, USA
Greg Trompeter Boston College, USA
Arnold Wright Boston College, USA
EDITORIAL POLICY AND SUBMISSION GUIDELINES
Advances in Accounting Behavioral Research (AABR) publishes articles encompassing all areas of accounting that incorporate theory from and contribute new knowledge and understanding to the fields of applied psychology, sociology, management science, and economics. The journal is primarily devoted to original empirical investigations; however, literature review papers, theoretical analyses, and methodological contributions are welcome. AABR is receptive to replication studies, provided they investigate important issues and are concisely written, and is receptive to methodological examinations that can potentially inform future behavioral research. The journal especially welcomes manuscripts that integrate accounting issues with organizational behavior, human judgment/decision making, and cognitive psychology.
Manuscripts will be blind-reviewed by two reviewers and an associate editor. The recommendations of the reviewers and associate editor will be used to determine whether to accept the paper as is, accept the paper with minor revisions, reject the paper, or invite the authors to revise and resubmit the paper.
MANUSCRIPT SUBMISSION
Manuscripts should be forwarded to the editor, Vicky Arnold, at [email protected] via e-mail. All text, tables, and figures should be incorporated into a Word document prior to submission. The manuscript should also include a title page containing the name and address of all authors and a concise abstract. Also, include a separate Word document with any experimental materials or survey instruments. If you are unable to submit electronically, please forward the manuscript along with the experimental materials to the following address:
Vicky Arnold, Editor
Advances in Accounting Behavioral Research
Kenneth G. Dixon School of Accounting
University of Central Florida
P. O. Box 161400
Orlando, FL 32816-1400, USA
References should follow the APA (American Psychological Association) standard. References should be indicated by giving (in parentheses) the author’s name followed by the date of the journal or book; or with the date in parentheses, as in ‘suggested by Canada (2005).’ In the text, use the form Hageman et al. (2006) where there are more than two authors, but list all authors in the references. Quotations of more than one line of text from cited works should be indented and citation should include the page number of the quotation; e.g. (Phillips, 2001, p. 56). Citations for all articles referenced in the text of the manuscript should be shown in alphabetical order in the Reference list at the end of the manuscript. Only articles referenced in the text should be included in the Reference list. Format for references is as follows:
For Journals
Dunn, C. L., & Gerard, G. J. (2001). Auditor efficiency and effectiveness with diagrammatic and linguistic conceptual model representations. International Journal of Accounting Information Systems, 2(3), 1–40.
For Books
Ashton, R. H., & Ashton, A. H. (1995). Judgment and decision-making research in accounting and auditing. New York, NY: Cambridge University Press.
For a Thesis
Smedley, G. A. (2001). The effects of optimization on cognitive skill acquisition from intelligent decision aids. Unpublished doctoral dissertation, University.
For a Working Paper
Thorne, L., Massey, D. W., & Magnan, M. (2000). Insights into selection-socialization in the audit profession: An examination of the moral reasoning of public accountants in the United States and Canada. Working paper. York University, North York, Ontario.
For Papers from Conference Proceedings, Chapters from Book, etc.
Messier, W. F. (1995). Research in and development of audit decision aids. In: R. H. Ashton & A. H. Ashton (Eds), Judgment and Decision Making in Accounting and Auditing (pp. 207–230). New York: Cambridge University Press.
THE ROLES OF ORGANIZATIONAL JUSTICE AND TRUST IN A GAIN-SHARING CONTROL SYSTEM
Frances A. Kennedy, James M. Kohlmeyer III, and Robert J. Parker
ABSTRACT
This study examines the roles of organizational justice and trust in a specific type of management control system (MCS), gain-sharing. According to the proposed theory, employee perceptions involving the procedural and distributive justice of the gain-sharing plan influence employee trust in managers. Positive perceptions of fairness lead to high trust, which, in turn, has positive consequences for the organization such as lower employee turnover. To investigate these issues, a survey was administered to employees of a large manufacturing company. Results of structural equation modeling indicate that employee perceptions regarding the fairness of the gain-sharing plan are positively related to employee trust in managers. Further, trust is linked to employee turnover intentions. The results imply that the organizational justice of an MCS has consequences for the attitudes and behaviors of employees and thus the success or failure of the MCS.
Advances in Accounting Behavioral Research, Volume 12, 1–23
Copyright © 2009 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012003
INTRODUCTION
Organizational control is a critical management function that has been extensively examined in accounting and other business literatures. In accounting, a number of researchers have emphasized that organizational control involves influencing human behavior (e.g., Ansari, 1977; Flamholtz, 1979; Flamholtz, Das, & Tsui, 1985; Merchant, 1985; Birnberg & Snodgrass, 1988). As Merchant (1985, p. 4) argues, control involves managers taking steps to help ensure that human beings do what is best for the organization. Flamholtz (1979, p. 290) defines an organizational control system as a set of mechanisms, which are designed to increase the probability that people will behave in ways that will lead to the attainment of organizational objectives. The mechanisms by which organizations influence behavior include a number of subsystems that involve planning, operations, performance measurement, and evaluation/rewards systems (Flamholtz, 1983). Some accounting researchers argue that management control systems (MCSs) constitute a subset of organizational control systems as, in their view, they primarily involve the control of mid-level managers by top-level executives (e.g., Fisher, 1998). Other researchers (e.g., Otley, 1994; Langfield-Smith, 1997) propose that this definition of MCS is too narrow for contemporary organizations as lower-level employees are increasingly empowered to participate in activities of strategic significance. In this broader approach, which is adopted in the current paper, the term MCS is often used interchangeably with organizational control. As Chenhall (2003) notes, studies of MCS abound in the accounting literature and the dominant paradigm is the contingency approach. In this approach, the effectiveness of MCS is theorized to depend upon the match between system design and a variety of contingency variables such as technology, environment, and structure.
As Chenhall (2003) argues, MCS success also may depend upon a number of psychology variables, which have been relatively ignored in previous research, such as organizational justice and trust between employees and managers. Accordingly, these psychology variables may help explain how individuals react to MCS and, consequently, the effectiveness of such systems. The current study examines the roles of organizational justice and trust in a specific type of MCS, gain-sharing. In gain-sharing plans, workers share productivity gains with employers through a bonus system. As Chenhall and Langfield-Smith (2003) note, a gain-sharing plan is an MCS with a formal and detailed system for measuring and rewarding performance that encourages worker involvement. The current study examines employee
perceptions of the fairness (i.e., organizational justice) of the gain-sharing plan. According to the proposed theory, employee perceptions regarding the procedural and distributive justice of the gain-sharing plan influence employee trust in managers. No prior studies, to the knowledge of the authors, have examined how the perceived organizational justice of an MCS such as gain-sharing affects trust. According to the theorized framework, positive fairness perceptions of the gain-sharing plan lead to high trust in managers because workers, through an attribution process, associate plan fairness with the managers who implement the plan. High trust, in turn, has positive consequences for the organization such as lower employee turnover (see meta-analysis by Dirks & Ferrin, 2002).
To investigate the proposed research issues, a field study was conducted in a large manufacturing plant in the Southeastern United States with annual sales of approximately $95 million. A survey was distributed to all full-time employees. There were 345 usable responses (76% of workforce). Structural equation modeling was used to analyze the hypothesized relations. Results reveal that employee perceptions regarding the fairness of the gain-sharing plan are positively related to employee trust in managers. Further, trust is linked to employee turnover intentions. The results imply that the organizational justice of an MCS has consequences for the attitudes and behaviors of employees and thus the success or failure of the MCS. Accountants who are involved in the design or implementation of MCS may need to consider fairness and trust issues.
The remainder of this paper is structured as follows. The Theoretical Development and Hypotheses section provides an overview of the related literature as the basis for developing three hypotheses. The next section discusses the organizational background of the company examined in this study. Data collection is then explained in the Research Method section. The following section presents the results, and the last section provides a summary and discussion.
THEORETICAL DEVELOPMENT AND HYPOTHESES
Gain-Sharing and Organizational Justice
Gain-sharing plans are increasingly popular in the United States (Welbourne & Gomez-Mejia, 1995; Mangel & Useem, 2000). Historically, gain-sharing first appeared in manufacturing plants during the 1930s; subsequently, by the last decades of the twentieth century, gain-sharing had
spread to a variety of service organizations (Welbourne, 1998). At least one large public accounting firm has experimented with it (Bowie-McCoy, Wendt, & Chope, 1993). While plan details vary from firm to firm, several common characteristics have been identified by researchers (e.g., Welbourne & Gomez-Mejia, 1995; Welbourne, Balkin, & Gomez-Mejia, 1995; Gomez-Mejia, 2000). Regarding compensation, gain-sharing plans involve formal and detailed bonus systems for employees (in addition to base pay). Formulae for bonuses typically are based upon productivity measures although other measures involving quality and customer service also are used. Financial gains that accrue to the organization from improvements in performance measures are shared with employees through the bonus system. Another distinguishing feature of gain-sharing is the use of group rewards. Improvements in performance measures result in bonuses for all employees in the gain-sharing unit. As gain-sharing involves performance measurement and employee rewards, issues of organizational justice may be relevant in understanding the impact of gain-sharing on employee attitudes and behavior (Welbourne et al., 1995; Welbourne, 1998). According to a number of researchers (e.g., Brockner & Wiesenfeld, 1996; Welbourne et al., 1995; Folger & Cropanzano, 1998; Welbourne, 1998), the organizational justice literature demonstrates that employees care deeply about the fairness of how their organizations evaluate and reward performance. Regarding evaluation and rewards, many researchers differentiate between two types of fairness or justice: procedural justice and distributive justice. Procedural justice involves the perceived fairness of the procedures or process through which the organization evaluates and rewards employees (Sweeney & McFarlin, 1993). Distributive justice, grounded in equity theory (Adams, 1965), involves the perceived fairness of the outcomes. 
In assessing distributive justice, individuals evaluate their work inputs (e.g., skills and motivation) relative to the outcomes received from the organization (e.g., pay and promotions). Research has demonstrated that perceptions of procedural and distributive justice are linked to a variety of employee-related outcomes such as organizational commitment, turnover intentions, job satisfaction, and job performance (see meta-analysis by Cohen-Charash & Spector, 2001). In the accounting literature, organizational justice has been used to examine a number of issues, including, most commonly, budgeting. Budget studies have examined fairness issues involving budgetary slack (Fisher, Frederickson, & Peffer, 2002; Libby, 2003) and participative budgeting (Lindquist, 1995; Libby, 1999; Lau & Lim, 2002a, 2002b). Other budgeting studies have investigated the links between the perceived fairness of the
budgeting system and outcomes such as budget goal commitment (Wentzel, 2002), organizational commitment (Staley, Dastoor, Magner, & Stolp, 2003), budget satisfaction (Maiga, 2006), and supervisory trust (Staley & Magner, 2007). Also, several accounting studies have examined organizational justice issues within accounting firms (Parker & Kohlmeyer, 2005; Johnson, Lowe, & Reckers, 2008). Chenhall (2003), in his review of MCS research, argues that organizational justice may be relevant to research in MCS as fairness concerns may influence employees’ reactions to MCS and hence the effectiveness of MCS. Gain-sharing, as it involves performance measurement and employee rewards, constitutes an MCS (Chenhall & Langfield-Smith, 2003). The current study examines the perceived fairness of the gain-sharing plan and its links to trust in managers.
Trust in Management
Employee trust in organizational leaders and managers has been acknowledged as a critical variable in the effective functioning of the workplace by numerous researchers in management and applied psychology (see Dirks & Ferrin, 2002, for an overview of these studies). Although the topic appears less frequently in the accounting literature, several researchers in the field also have noted its importance (e.g., Ross, 1994; Chenhall & Langfield-Smith, 2003). As discussed in several studies (e.g., Bigley & Pearce, 1998; Dirks & Ferrin, 2002), trust has been defined in many ways by different researchers. Bigley and Pearce (1998, p. 407), in a review of studies examining trust within organizations, argue that common to trust conceptualizations is the idea of actor vulnerability. Dirks and Ferrin (2002), in a meta-analysis of employee trust in leadership, reach a similar conclusion. They define trust in the workplace as the willingness of employees to accept vulnerability in their relations with their managers. A number of studies have theorized that organizational justice influences trust in management. As Dirks and Ferrin (2002) point out, in this literature, two referents for management have been examined: direct manager/supervisor; and organizational/collective leadership. The current study focuses on the relation between organizational justice and trust in the employee’s immediate supervisor. Early studies in this area report links between the procedural and distributive justice of the employee’s performance appraisal and trust in the supervisor who conducts the appraisal (Alexander & Ruderman, 1987; Folger & Konovsky, 1989; Korsgaard & Roberson, 1995). Subsequent studies investigated the relation between interactional justice
and trust in superiors. Interactional justice is the quality of interpersonal interactions between employee and supervisor (Cropanzano, Prehar, & Chen, 2002). Studies that report evidence of a direct relation between interactional justice and supervisory trust include Ambrose and Schminke (2003) and Aryee, Budhwar, and Chen (2002). These studies use social exchange theory to explain the relation between justice and trust. Accordingly, employees have a number of social exchange relations within the organization including relations with both their immediate supervisor and the organization. Within this framework, interactional justice is paramount in employee relations with their superior, whereas the procedural and distributive justice of organizational practice is paramount in the relation between employee and organization. For example, Aryee et al. (2002) argue that interactional justice influences supervisory trust, while procedural and distributive justice of organizational practices influences trust in the organization. Results in Ambrose and Schminke (2003) suggest that this framework may not fully explain the relation between organizational justice and supervisory trust. In their sample, both interactional justice and a global measure of the procedural justice of organizational practice were significant predictors of supervisory trust. As Dirks and Ferrin (2002) argue, based upon their meta-analysis, two differing theoretical frameworks may explain how fairness perceptions influence trust in leadership: (1) the relationship-based perspective; and (2) the character-based perspective. In the relationship-based perspective, which relies on social exchange theory, the fairness of workplace practices and decisions signals the nature of the relation between employee and leader. In the character-based perspective, which is more relevant to the current study, the fairness of workplace practices and decisions reflects the character of the leader.
Accordingly, employees may believe that the fairness of organizational practices reflects the collective character of organizational leadership which, in turn, influences employee trust in leadership. Further, in the mind of the employees, the fairness of organizational practices may reflect upon the character of the employee’s immediate supervisor. As argued in Dirks and Ferrin (2002), based upon attribution theory, employees may attribute the fairness of organizational practices to their immediate superior who implements the practice even when the superior has no involvement in the formulation of the practice. As noted in their subsequent study (Ferrin & Dirks, 2003, p. 19), trust development can be viewed as an attributional process. In this process, an individual infers the trustworthiness of another person based upon that person’s behavior. In interpreting the behavior of another person, the individual must assess whether the behavior reflects that
person’s character (and hence trustworthiness) or the influence of external forces beyond the control of the person. Prior research suggests that individuals, in assessing the behaviors of others, routinely bias their assessment and attribute behavior to character when external forces are responsible for the behavior (Ferrin & Dirks, 2003). The current study proposes that the perceived fairness of an MCS such as gain-sharing influences employee trust in superiors through the attribution process discussed in Dirks and Ferrin (2002). Although the immediate supervisor is not involved in the decisions of the gain-sharing plan (e.g., the selection of performance measures, determination of compensation formula), through the supervisor’s role as a plan facilitator, the supervisor becomes associated with the plan in the minds of employees. The fairness of the plan becomes linked to the supervisor’s character and ultimately to employee trust in the supervisor. The following hypotheses summarize these arguments:
H1. The perceived distributive justice of the gain-sharing plan is linked to employee trust in their managers.
H2. The perceived procedural justice of the gain-sharing plan is linked to employee trust in their managers.
The hypothesized relations between organizational justice and supervisory trust differ from previous studies in this area. Several prior studies propose that the fairness of performance appraisal is related to trust in the superior who conducts the evaluation (Alexander & Ruderman, 1987; Folger & Konovsky, 1989; Korsgaard & Roberson, 1995). In this setting, the supervisor is directly involved in the evaluation decision that affects the employee. The current study argues that employee trust in their supervisors is influenced by fairness perceptions of the MCS, which is a system that the supervisor does not control. A few other studies (e.g., Ambrose & Schminke, 2003) report that global measures of procedural justice are linked to supervisory trust.
Global measures capture the overall perception of the procedural justice of the organization, which, theoretically, covers all organizational practices. Perceptions of incentives/reward systems are mixed with perceptions of all other organizational practices including those in which the immediate supervisor participates in the decision-making that affects the employee (e.g., performance evaluation, work scheduling, and task assignment). Global measures do not identify the specific organizational or supervisor
practices/characteristics that are important to employee perceptions of fairness. The current study focuses on fairness perceptions of a specific organizational feature, the MCS. This focus has relevance for those accountants, such as controllers, who have a role in the design or implementation of employee reward systems.
Consequences of Employee Trust
In the management and applied psychology literatures, researchers have theorized that employee trust in management influences workplace outcomes such as job performance, organizational citizenship behaviors, job satisfaction, organizational commitment, and turnover intentions (see meta-analysis by Dirks & Ferrin, 2002). In the accounting literature, Chenhall and Langfield-Smith (2003) emphasize that trust is pivotal in promoting cooperation between employees and managers, which in turn is critical to strategy formulation and execution. Hofstede (1968, p. 266), in a seminal study of budgetary control, argues that the most important characteristic for the manager is perhaps trust, which creates the atmosphere of safety in which the team spirit can operate. The current study focuses on the relation between trust and turnover intentions. Results of prior studies suggest a strong relation between them (see meta-analysis by Dirks & Ferrin, 2002). As Dirks and Ferrin (2002) argue, given the nature of organizational hierarchies, employees are vulnerable to the actions of their managers; that is, employees are at risk regarding their managers. Individuals who do not trust their managers are likely to leave their jobs, presumably to find more trustworthy bosses who will not put them at risk. The related hypothesis appears below:
H3. Employee trust in their managers is linked to turnover intentions.
The model that corresponds to the hypotheses appears in Fig. 1. As illustrated in the figure, H1 and H2 propose that distributive and procedural justice influence employee trust in managers, while H3 proposes that trust influences turnover. Within this framework, the effect of organizational justice on turnover is mediated by trust.
Most studies in the organizational justice literature have theorized that organizational justice directly affects turnover intentions; further, the results of several meta-analyses indicate a strong correlation between justice and turnover (Cohen-Charash & Spector, 2001; Colquitt, Conlon, Wesson, Porter, & Ng, 2001). This study explores the potential role of supervisory trust in explaining the strong association between justice and turnover reported in prior studies. (Aryee et al., 2002, examine trust in the organization, a construct that differs from supervisory trust, as a mediating variable in the relation between organizational justice and turnover intentions.)
Fig. 1. Hypothesized Model. Paths: Distributive Justice → Trust in Manager (H1); Procedural Justice → Trust in Manager (H2); Trust in Manager → Intention to Leave (H3).
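The hypothesized model in Fig. 1 can also be written as a pair of structural equations. The following is a minimal sketch, assuming linear relations among the latent variables; the symbols $\gamma_1$, $\gamma_2$, $\beta$, $\zeta_1$, and $\zeta_2$ are illustrative notation introduced here and do not appear in the original text.

```latex
% Illustrative notation only (an assumption, not the authors' own):
% DJ = perceived distributive justice, PJ = perceived procedural justice,
% Trust = trust in manager, Leave = intention to leave,
% zeta_1, zeta_2 = structural disturbance terms.
\begin{align}
\text{Trust} &= \gamma_1\,\text{DJ} + \gamma_2\,\text{PJ} + \zeta_1, \\
\text{Leave} &= \beta\,\text{Trust} + \zeta_2 .
\end{align}
```

In this notation, H1 and H2 correspond to $\gamma_1 \neq 0$ and $\gamma_2 \neq 0$ (with positive signs expected, since fairness is theorized to raise trust), and H3 corresponds to $\beta \neq 0$ (with a negative sign expected, since high trust is theorized to lower turnover). Because the sketch contains no direct path from justice to intention to leave, the implied effects of distributive and procedural justice on turnover intentions are the products $\gamma_1\beta$ and $\gamma_2\beta$, which is what mediation by trust means in this framework.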
ORGANIZATIONAL BACKGROUND
The manufacturing plant examined in this study was established in the 1950s by a global Fortune 500 company. This plant manufactures utility hardware. In 2003, the company intended to shut down the facility because of a history of losses. Instead, they sold the plant to the principals of an organization consulting firm. These two principals had established a relationship with the employees through several training initiatives over a number of years. They changed the name of the company and continued production of utility hardware. The principals believed that their consulting firm could provide the experience necessary to assist in the transition. Planned initiatives included establishing teams, implementing 5S and visual factory concepts,1 reorganizing into manufacturing cells, and using Kaizen blitzes to incorporate continuous improvement and JIT techniques. Production averages 180,000 units per year of 4 main products, totaling $94.5 million in sales. The cycle times to complete one unit range from 3 to 6 days, depending on the product. The plant consists of an area of over 677,000 square feet and employs 461 people (380 hourly, 81 salaried, and a temporary workforce). Seventy-seven percent of hourly employees belong to the IBEW union in this right-to-work state and the nonexempt employees belong to the AWEA union. There has been no work stoppage for either union in over 25 years. A new collective bargaining agreement was
negotiated that included annual increases, gain-sharing, and a 5-year agreement to establish stability. Although the new owners had no experience with managing a manufacturing facility, they were experienced change consultants. One of their key concerns was turning a consistent loss position into profit. They implemented the gain-sharing program in order to create an environment in which the employees were more involved and motivated. The owners believed that the goals in the gain-sharing plan would help to both increase workplace stability and leverage employees’ work efforts most effectively.
Turnaround Strategy: Gain-Sharing Program

In order to focus on key areas, the principals enacted a gain-sharing program designed to involve employees in ongoing improvement efforts and to share in the rewards. A team of managers, hourly employees, and a union representative identified multiple measurements to monitor progress toward company goals. These became the performance measures used in establishing gain-sharing targets. The gain-sharing team consists of the two principal owners and the managers of human resources, operations, quality, and continuous improvement and maintenance. Other members include representatives from major plant areas such as manufacturing support, 5S auditor, union representative, finance, two line workers, and two supervisors from both first and second shifts. The team meets monthly to review results and set the gain-sharing targets for the following month.

Setting Goals

At monthly meetings, the gain-sharing team reviews the prior month's results and sets the targets and payout amounts for the following month. The team has flexibility to change or add goals as the business climate warrants. For example, during 2004, the four metrics used consistently through the year were man-hours per unit, end-of-line (EOL) defects, test failures, and on-time complete orders. During April and May, EOL defects were replaced with overdue shipments. The team met in January 2005 and established a revised set of base measures for the new year, as follows: scrap dollars per unit, meeting daily required production, speed (measured as inventory turns and order-to-invoice time), and cash flow (measured as meeting budgeted net cash generated by operations). The goals are set using a process called “continuous improvement goal setting” (CIGS). During
this process, goal setters are provided with the lowest, highest, and average monthly performance on the target metric. The goal selected is purposely between the highest and the average score. This goal-setting process is used because it is transparent: workers can see not only that they have achieved the target before but also that they have exceeded it.

Setting Payout Values

The gain-sharing team also adjusts the dollar amount of the monthly payouts to employees. During 2004, the maximum monthly payout per employee was set at $200, established with consideration of budget restrictions and employee impact. This maximum payout was raised to $225 in 2005. Dollar amounts for each individual goal were set such that the total payout for all goals was $225. In this way, the gain-sharing team was able to weight the various factors considered key in the current business environment. Employees earn the payout amount for each target separately; therefore, if only one goal is met, they still receive a payout. Potentially, a production floor employee on the base pay scale could earn approximately 10% of their base salary in a gain-sharing bonus. This assumes all goals and gates are met and they are awarded a maximum payout of $200 per month. Actual performance, however, fell short of this level. The total payout per employee during 2004 was $695. Over a period of 20 months, there were 12 payouts ranging from $25 to $125.

Plan Administration

Supervisors present the monthly performance targets and payout values to employees. At the beginning of each shift, the supervisor and employees meet for five minutes to review any production issues and discuss any other necessary information, including the gain-sharing plan. It is at this time that the supervisor shares payout values, performance targets, and progress. The gain-sharing team provides the supervisor with a sheet of talking points on which to work.
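The goal-setting rule and the separate per-goal payouts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the company's actual procedure: the text says only that the goal lies between the average and the highest past result, so the midpoint rule (`weight=0.5`), the function names, the sample history, and the split of the $225 maximum across the four goals are all hypothetical. The sketch also assumes a metric for which higher values are better.

```python
from statistics import mean


def cigs_target(history, weight=0.5):
    """Return a target between the average and the best past result.

    Assumes higher values are better. `weight` is a hypothetical knob:
    0 gives the historical average, 1 gives the historical best.
    """
    if not 0 <= weight <= 1:
        raise ValueError("weight must be between 0 and 1")
    avg, best = mean(history), max(history)
    return avg + weight * (best - avg)


def monthly_payout(goal_values, goals_met):
    """Sum the dollar values of the goals met; each goal pays separately."""
    return sum(value for goal, value in goal_values.items() if goals_met.get(goal))


# Hypothetical monthly on-time order percentages (not data from the study).
on_time_history = [88.0, 92.0, 96.0]
target = cigs_target(on_time_history)

# Hypothetical split of the 2005 maximum of $225 across the four goals.
goal_values = {"scrap": 50, "daily_production": 75, "speed": 50, "cash_flow": 50}
payout = monthly_payout(goal_values, {"scrap": True, "speed": True})
print(f"target = {target}, payout = ${payout}")
```

Because each goal carries its own dollar value, meeting two of the four hypothetical goals still produces a partial payout, which mirrors the plan feature that a single met goal is rewarded.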
RESEARCH METHOD

Sample and Procedure

A survey was used to collect data within one manufacturing facility to examine the hypotheses. Employees were asked to meet in the cafeteria near the beginning or end of their shift to complete the survey during their paid
work hours. The researchers were present to administer the survey, answer any questions about how the data would be used, and ensure the complete anonymity of responses. When finished, each employee inserted their own survey in a large envelope in the researcher's possession. The survey was completed by 404 of the 453 full-time employees. For the analyses, 59 respondents were eliminated due to missing responses, leaving 345 respondents in the final sample (76% of the workforce). Two-thirds of the sample is male and one-third is female. The majority of the sample employees work on first shift (56%), while the remaining employees work on second and third shift (28% and 16%, respectively). This is a mature workforce, with 41% of the employees having worked at this facility for more than 20 years, 11% between 11 and 20 years, 21% between 5 and 10 years, and only 26% for less than 5 years. The highest educational degree attained by 58% of the sample is high school, 21% have some college credits, and 11% have college degrees. Advanced degrees are held by 4%, and the remaining 6% hold technical degrees.
Variable Measurement

This study assesses the impact of three latent variables on the intent of employees to leave the company. All four variables were measured with scales that have been used in previous studies. All items were measured on a seven-point scale (1 = strongly disagree and 7 = strongly agree), with higher values indicating a higher level of the construct. Scale items may be found in Table 1. The dependent variable, “intentions to leave,” was measured using a four-item scale developed by Rosin and Korabik (1991). Trust in manager was measured with a four-item scale that Korsgaard and Roberson (1995) adapted from Cook and Wall (1980). The six-item procedural justice scale was developed by Welbourne et al. (1995) to specifically assess the general rules and administration of a gain-sharing plan. Similarly, for distributive justice, we used a five-item scale developed by Welbourne et al. (1995), who adapted items previously used by Alexander and Ruderman (1987) and Folger and Konovsky (1989).

EQS 6.1 was used to conduct confirmatory factor analysis (CFA) to determine the reliability of the measures used in this study. This approach was used to assess whether the measurement model fits the data and to assess the reliability of constructs of the measurement instrument (Anderson & Gerbing, 1988; Campbell & Fiske, 1959). The goodness of fit
Table 1. Maximum Likelihood Estimation of the Measurement Model for the Latent Variables.

Latent Variable   Standardized
and Item          Factor Loading   Item Wording

Distributive justice
  DJ1             0.770            The size of our bonus is fair
  DJ2             0.898            The bonus we receive is fair
  DJ3             0.834            All in all, the bonus payment is what it ought to be
  DJ4             0.813            Our bonus is fair compared to what others are getting
  DJ5             0.786            The extent to which the bonus gives us the full amount we deserve is fair

Procedural justice
  PJ1             0.808            The design of the gain-sharing plan seems fair
  PJ2             0.851            The gain-sharing plan formula is fair to all employees
  PJ3             0.860            The gain-sharing plan is administered fairly
  PJ4             0.780            The rules used for sharing the gain-sharing bonus with all employees are fair
  PJ5             0.762            When determining whether a gain-sharing bonus will be paid, the company uses accurate information about the department's performance
  PJ6             0.589            The performance level required to receive a gain-sharing bonus is clear to me

Trust in management
  TR1             0.778            Taking all things into consideration, I am satisfied with my manager
  TR2             0.863            My manager is honest in his/her dealings with me
  TR3             0.871            I trust my manager
  TR4             0.808            My manager is sincere in his/her attempt to meet my point of view

Intent to leave
  LV1             0.589            At this time in my career, I would quit my job if it were feasible
  LV2             0.688            I am actually planning to leave my job within the next 6 months
  LV3             0.729            I am actively searching for another job right now
  LV4             0.578            I have had thoughts about leaving my job
indices, CFI and RMSEA, recommended by Kline (2005) and Hu and Bentler (1999), were used to assess the measurement model fit. The results indicate that the measurement model fits the data reasonably well, with a CFI of .959 and an RMSEA of .059. Internal consistency reliability of the scales was assessed by examining Cronbach alpha scores (Cronbach, 1951; DeVellis, 1991). The lowest score
Table 2. Descriptive Statistics and Correlations among Study Variables.

Variable                    Mean   Standard Deviation   DJ      PJ      TR      LV
Distributive justice (DJ)   4.06   1.59                 .91
Procedural justice (PJ)     4.00   1.65                 .76**   .90
Trust in managers (TR)      4.39   1.67                 .49**   .50**   .91
Intent to leave (LV)        3.40   1.62                 .30**   .33**   .40**   .74

Notes: N = 345. Coefficient alpha statistics appear on the diagonal. **p < .001.
of .74 was for the “intent to leave” scale, with the remaining three constructs scoring at approximately .90. Also, all factor loadings for the indicator variables were significant at the .01 level. This finding supports the convergent validity of the indicators (Anderson & Gerbing, 1988). Table 1 presents the items along with the results of maximum likelihood estimation of the measurement model for the latent variables.

Table 2 presents descriptive statistics as well as Pearson correlations and Cronbach's alpha statistics for each variable. All variables are significantly correlated, with the highest correlation of .76 between procedural justice and distributive justice. Variance inflation factors (VIF) and tolerance statistics were used to test for multicollinearity among the variables. VIF measures the ratio of total standardized variance to unique variance. Kline (2005) suggests that a VIF greater than 10 indicates redundancy in the measures. The maximum VIF for the variables is 2.4, which is well below this limit. The tolerance statistic is an alternative measure that assesses the proportion of total standardized variance that is unique. Kline (2005) suggests that a tolerance statistic lower than .10 would indicate potential multicollinearity. The minimum tolerance statistic for the variables is .42, which is well above the .10 threshold. Both the VIF and tolerance statistics indicate that multicollinearity is not a concern among these variables.

In order to further demonstrate that each dimension is distinct, the coefficient alpha in the diagonal should be greater than the correlation coefficients within a column (Churchill, 1979). Table 2 clearly shows that the internal reliability of each latent dimension (Cronbach's alpha) is higher than its correlations with the other dimensions.

There is a concern that the data may suffer from common method variance since self-reported data are used exclusively in this study (Campbell & Fiske, 1959; Podsakoff & Organ, 1986). Because these constructs are perceptions, no alternative sources may be used to validate the data. An exploratory factor analysis using an oblique rotation and including all
Table 3. Results of Exploratory Factor Analysis.

Panel A: Oblique Rotation

Item                     Factor 1   Factor 2   Factor 3   Factor 4
Procedural justice 1     .001       .850       .007       .027
Procedural justice 2     .338       .590       .065       .056
Procedural justice 3     .081       .808       .038       .007
Procedural justice 4     .321       .535       .024       .002
Procedural justice 5     .265       .505       .050       .070
Procedural justice 6     .137       .426       .146       .044
Distributive justice 1   .735       .020       .041       .011
Distributive justice 2   .780       .121       .040       .011
Distributive justice 3   .777       .074       .043       .043
Distributive justice 4   .748       .076       .005       .006
Distributive justice 5   .748       .098       .028       .035
Trust in managers 1      .050       .014       .689       .110
Trust in managers 2      .009       .028       .928       .064
Trust in managers 3      .013       .065       .813       .027
Trust in managers 4      .017       .038       .775       .014
Intent to leave 1        .015       .127       .014       .541
Intent to leave 2        .090       .019       .018       .734
Intent to leave 3        .060       .042       .025       .711
Intent to leave 4        .240       .189       .040       .556

Panel B: Percent Variance Explained by Each Factor

Factor 1   Factor 2   Factor 3   Factor 4
32.1%      29.0%      23.9%      15.0%

Note: Factor loadings for items in each scale are highlighted in bold.
variables in this study yielded four distinct factors with eigenvalues greater than one. If the majority of variance is explained by the first factor, then there is significant bias (Podsakoff & Organ, 1986). In this analysis, the first factor explains only 32.1% of the variance, which indicates that any effects of common method variance are minimal. Table 3 presents the results of the exploratory factor analysis. These analyses indicate that all four constructs are separate and distinct.
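The multicollinearity check described earlier in this section can be illustrated with a short Python sketch. For standardized variables, the VIF of variable j equals the j-th diagonal element of the inverse correlation matrix, and the tolerance is its reciprocal. The sketch rests on assumptions that the text does not confirm: it uses the construct-level correlations among distributive justice, procedural justice, and trust from Table 2, and the helper names are hypothetical. The study does not state which variables or which software routine produced its statistics, so the values printed here need not match the reported 2.4 and .42 exactly.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tolerance(corr):
    """Return (VIFs, tolerances) from a correlation matrix.

    VIF_j is the j-th diagonal element of the inverse correlation matrix;
    tolerance_j = 1 / VIF_j, i.e., 1 - R^2 of variable j on the others.
    """
    inverse = invert(corr)
    vif = [inverse[j][j] for j in range(len(corr))]
    return vif, [1.0 / v for v in vif]


# Correlations among three constructs as reported in Table 2
# (order: distributive justice, procedural justice, trust in managers).
# Treating exactly these three as the variable set is an assumption.
predictor_corr = [
    [1.00, 0.76, 0.49],
    [0.76, 1.00, 0.50],
    [0.49, 0.50, 1.00],
]
vif, tolerance = vif_and_tolerance(predictor_corr)
for name, v, t in zip(["DJ", "PJ", "TR"], vif, tolerance):
    print(f"{name}: VIF = {v:.2f}, tolerance = {t:.2f}")
```

Under these assumptions every VIF stays far below the cutoff of 10 and every tolerance stays well above .10, which is consistent with the conclusion drawn in the text.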
RESULTS

The model depicted in Fig. 1 was estimated using structural equation modeling (SEM) software, EQS 6.1. A review of kurtosis and skewness
reveals that all indicator variables are well below the thresholds recommended by Kline (2005) (10.0 and 3.0, respectively), indicating univariate normality. Multivariate normality is necessary for maximum likelihood estimation of SEM and is assessed with Mardia's normalized estimate of 24.6, which is below the recommended threshold of 30 (Kline, 2005). Thus, the data are both univariate and multivariate normal.2

Fig. 2 reports the results of the SEM analysis testing the hypotheses. The overall fit statistics of the hypothesized structural model are good, with a CFI of .955 and an RMSEA of .060. The standardized coefficients support hypotheses H1 and H2, which predict that employee perceptions of distributive justice and procedural justice positively influence trust in managers (p < .01). (R2 is .327 with trust as the endogenous variable.) H3, which predicts a negative relationship between trust in managers and turnover intentions, is also supported (p < .001). (R2 is .224 with turnover intent as the endogenous variable.)

To examine the strength of this model, the hypothesized model is compared to the full model that includes direct paths from the two exogenous variables to the dependent variable. The three hypothesized paths remain significant, while the two additional paths are not significant. There is virtually no change in the fit statistics (CFI = .961; RMSEA = .058). This further supports the significance of the strong relations in the hypothesized model.

This study suggests a mediating relationship between the exogenous variables and the dependent variable. Table 4 presents the decomposition of the direct and indirect effects of the full model. Sobel tests were performed to determine the significance of the effect of each independent variable on the dependent variable through the mediator (Baron & Kenny, 1986). Results indicate that the indirect effects of both procedural justice and distributive justice on intent to leave are significant, while the direct effects are not significant. 
This pattern is the strongest evidence of mediation (Kline, 2005). This supports the proposition that employees’ trust in managers mediates the effects of justice perceptions on their intentions to leave employment.
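The Sobel test mentioned above can be sketched as follows. The sketch uses the common first-order formula, z = ab / sqrt(b²·sa² + a²·sb²), where a is the path from the independent variable to the mediator and b is the path from the mediator to the dependent variable. The input values are hypothetical and chosen only for illustration; they are not the unstandardized estimates from this study, which were obtained in EQS, and the function name is an assumption.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for an indirect effect a * b.

    a, se_a: unstandardized path (and standard error) from X to the mediator.
    b, se_b: unstandardized path (and standard error) from the mediator to Y.
    Returns the z statistic and a two-tailed p-value from the normal curve.
    """
    standard_error = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return z, p_value


# Hypothetical inputs: justice raises trust (a > 0) and trust lowers
# intent to leave (b < 0), so the indirect effect is negative.
z, p = sobel_test(a=0.30, se_a=0.08, b=-0.40, se_b=0.07)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A significant negative z combined with a nonsignificant direct path is the pattern the text interprets as mediation by trust.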
SUMMARY

This paper examines the roles of organizational justice and trust in a specific type of MCS, gain-sharing. According to the proposed theory, employee perceptions involving the procedural and distributive justice of the gain-sharing plan influence employee trust in management. Positive perceptions
Fig. 2. Results of Structural Model Analyses: Hypotheses, Standardized Coefficients, and Fit Indices (N = 345).

MODEL: Hypothesized
  Distributive Justice -> Trust in Manager (H1):    .31*
  Procedural Justice -> Trust in Manager (H2):      .29*
  Trust in Manager -> Intention to Leave (H3):      -.47***

MODEL: Full
  Distributive Justice -> Trust in Manager:         .31*
  Procedural Justice -> Trust in Manager:           .28*
  Trust in Manager -> Intention to Leave:           -.37***
  Distributive Justice -> Intention to Leave:       -.04
  Procedural Justice -> Intention to Leave:         -.22

Model Fit Statistics

Fit Indices          Hypothesized Model   Full Model   Independence Model
Chi-Squared          329***               324***       4232
Degrees of Freedom   148                  146          171
CFI                  .955                 .961         -
RMSEA                .060                 .058         -

* p < .05 ** p < .01 *** p < .001
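The fit indices in Fig. 2 can be related to the chi-squared statistics with the standard textbook formulas: RMSEA = sqrt(max(χ² − df, 0) / (df·(N − 1))), and CFI = 1 − max(χ²_M − df_M, 0) / max(χ²_M − df_M, χ²_B − df_B, 0), where B denotes the independence (baseline) model. The sketch below is a minimal illustration assuming these formulas; the function names are hypothetical, and software packages differ in details (for example, some use N rather than N − 1), so values computed from the rounded chi-squared figures can differ slightly from published ones.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation, using N - 1 in the denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the independence (baseline) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    return 1.0 - d_model / d_base if d_base > 0 else 1.0


# Hypothesized model and independence model values taken from Fig. 2.
N = 345
print(f"RMSEA = {rmsea(329, 148, N):.3f}")
print(f"CFI   = {cfi(329, 148, 4232, 171):.3f}")
```

Applied to the hypothesized model, these formulas give an RMSEA of about .060 and a CFI of about .955.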
Table 4. Decomposition into Direct and Indirect Effects.

                                         Endogenous Variables
                        Trust in managers             Intentions to leave
Exogenous Variables     Standard   Standardized       Standard   Standardized
                        error      parameters         error      parameters

Procedural justice
  Direct effect         .16        .28***             .17        -.22
  Indirect effects      –          –                  .01        -.09***
  Total effect          .16        .28***             .18        -.31*

Distributive justice
  Direct effect         .16        .31**              .17        -.04
  Indirect effects      –          –                  .01        -.11***
  Total effect          .16        .31**              .18        -.15

Trust in managers
  Direct effect         –          –                  .02        -.37***
  Indirect effects      –          –                  –          –
  Total effect          –          –                  .02        -.37

Note: Significance of standardized effects was tested using unstandardized errors. *p < .05. **p < .01. ***p < .001.
of fairness lead to high trust, which in turn has positive consequences for the organization (e.g., low employee turnover intentions). The results of this study indicate that employee perceptions involving the fairness of the gain-sharing plan are positively related to employee trust in management. Further, the level of employee trust is negatively associated with turnover intentions, indicating a greater level of workforce stability. The relationships of procedural and distributive justice to workforce stability appear to be completely mediated by the degree of employees’ trust in managers. Most prior research has theorized and found direct effects between organizational justice and employee outcomes such as turnover intentions. We find, however, no such direct effects. The current study focuses on the perceived fairness of one type of MCS, gain-sharing. The results may be applicable to other performance measurement and reward systems (e.g., piece-rate systems). If so, accountants should be aware of fairness issues in the design and implementation of MCS as fairness perceptions influence employees and therefore the organization. As several researchers argue (e.g., Flamholtz, 1979; Merchant, 1985), one of the
fundamental objectives of organizational control is to influence employee behavior so that the organization benefits. Among the limitations of the current study is the use of a survey in only one company. Results may be specific to this company and its gain-sharing plan. Also, as is common in survey research of this type, potential problems with omitted variables may exist. Another limitation is that all the variables are reported by the individual employee and, therefore, are subject to same source bias. Because this study examines the effects of employee perceptions on the employee’s turnover intentions, the implications of this bias are most likely minimal. The length of the full survey may have resulted in fatigue in some respondents. This effect was evidenced in the number of missing responses in the last section of the survey. Consequently, these surveys were dropped from the sample. Possible factors that may influence the findings of this study are the variations and frequent changes to the gain-sharing measures and targets. Because the company established a flexible system that has the ability to change targets and payouts according to economic and customer needs, the frequent payouts and changes may affect employee outcomes. The results in this study reflect the joint effects of many plan variations.3 Future research examining the individual effects of these changes (e.g., payout frequency, amount, levels, and gates) may begin to partition out the impacts of these plan elements. Also, for the different dimensions of the plan, there may be differences between procedural and distributive justice in their effect on outcomes like trust. For some plan features, one type of justice may be relevant whereas the other is not. Some researchers argue that the type of justice that is relevant depends upon the context although the evidence supporting differential effects is mixed (see meta-analysis by Colquitt et al., 2001). 
The current study could be extended in a number of other ways. The current study examines the impact of justice perceptions and employee trust upon only one employee outcome, turnover intentions. Other employee outcomes could be investigated such as job satisfaction, organizational commitment, and organizational citizenship behaviors. Another potential outcome is organizational performance. As Chenhall and Langfield-Smith (2003) argue, employee trust fosters cooperation between management and employees which, in turn, fosters high organizational performance. If justice perceptions influence employee trust, which the results of the current study suggest, and trust influences performance as suggested by Chenhall and Langfield-Smith (2003), then justice perceptions may affect performance via trust.
NOTES

1. 5S is a visual management technique that relies on a process of sort, shine, set in order, standardize, and sustain (Galsworth, 1997).
2. A plot of residuals was examined for heteroskedasticity, and the data were determined to be normal.
3. As of 2008, the gain-sharing plan was still in effect. The company reported a six-figure net income for the first time in 2007. Also in that year, 3 years after the gain-sharing plan began, employees earned a maximum payout in 2 months.
ACKNOWLEDGMENTS

We gratefully acknowledge the time and assistance provided by the company participants. We thank Sally Widener for her insightful comments. In addition, the authors thank the editor, the associate editor, and the two anonymous reviewers for their constructive comments.
REFERENCES

Adams, J. S. (1965). Injustice in social exchange. In: L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 2). New York: Academic Press.
Alexander, S., & Ruderman, M. (1987). The role of procedural and distributive justice in organizational behavior. Social Justice Research, 1(2), 177–198.
Ambrose, M., & Schminke, M. (2003). Organization structure as a moderator of the relationship between procedural justice, interactional justice, perceived organizational support, and supervisory trust. Journal of Applied Psychology, 88(2), 295–305.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103, 411–423.
Ansari, S. (1977). An integrated approach to control system design. Accounting, Organizations and Society, 2(2), 101–112.
Aryee, S., Budhwar, P., & Chen, Z. (2002). Trust as a mediator of the relationship between organizational justice and work outcomes: Test of a social exchange model. Journal of Organizational Behavior, 23, 267–285.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182.
Bigley, G., & Pearce, J. (1998). Straining for shared meaning in organization science: Problems of trust and distrust. Academy of Management Review, 23(3), 405–421.
Birnberg, J., & Snodgrass, C. (1988). Culture and control: A field study. Accounting, Organizations and Society, 13(5), 447–464.
Bowie-McCoy, S., Wendt, A., & Chope, R. (1993). Gainsharing in public accounting: Working smarter and harder. Industrial Relations, 32(3), 432–445.
Brockner, J., & Wiesenfeld, B. M. (1996). An integrative framework for explaining reactions to decisions: The interactive effects of outcomes and procedures. Psychological Bulletin, 120, 189–208.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Chenhall, R. (2003). Management control systems design within its organizational context: Findings from contingency-based research and directions for the future. Accounting, Organizations and Society, 28, 127–168.
Chenhall, R., & Langfield-Smith, K. (2003). Performance measurement and reward systems, trust and strategic change. Journal of Management Accounting Research, 15, 117–143.
Churchill, G. A. (1979). A paradigm for developing better measures of marketing constructs. Journal of Marketing Research, XVI, 64–73.
Cohen-Charash, Y., & Spector, P. (2001). The role of justice in organizations: A meta-analysis. Organizational Behavior and Human Decision Processes, 86(2), 278–321.
Colquitt, J., Conlon, D., Wesson, M., Porter, C., & Ng, K. (2001). Justice at the millennium: A meta-analytic review of 25 years of organizational justice research. Journal of Applied Psychology, 86(3), 425–445.
Cook, J., & Wall, T. (1980). New work attitude measures of trust, organizational commitment and personal need non-fulfillment. Journal of Occupational Psychology, 53, 39–52.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
Cropanzano, R., Prehar, C., & Chen, P. (2002). Using social exchange theory to distinguish procedural from interactional justice. Group & Organization Management, 27(3), 324–351.
DeVellis, R. (1991). Scale development. Newbury Park, CA: Sage Publications.
Dirks, K., & Ferrin, D. (2002). Trust in leadership: Meta-analytic findings and implications for research and practice. Journal of Applied Psychology, 87(4), 611–628.
Ferrin, D., & Dirks, K. (2003). The use of rewards to increase and decrease trust: Mediating processes and differential effects. Organization Science, 14(1), 18–31.
Fisher, J. (1998). Contingency theory, management control systems and firm outcomes: Past results and future directions. Behavioral Research in Accounting, 10(Suppl.), 47–65.
Fisher, J., Frederickson, J., & Peffer, S. (2002). The effect of information asymmetry on negotiated budgets: An empirical investigation. Accounting, Organizations and Society, 27, 27–43.
Flamholtz, E. (1979). Behavioral aspects of accounting/control systems. In: S. Kerr (Ed.), Organizational behavior (pp. 289–316). Columbus, OH: Grid Publishing.
Flamholtz, E. (1983). Accounting, budgeting and control systems in their organizational context: Theoretical and empirical perspectives. Accounting, Organizations and Society, 8(2/3), 153–169.
Flamholtz, E., Das, T., & Tsui, A. (1985). Toward an integrative framework of organizational control. Accounting, Organizations and Society, 10(1), 35–50.
Folger, R., & Cropanzano, R. (1998). Organizational justice and human resource management. Thousand Oaks, CA: Sage Publications.
Folger, R., & Konovsky, M. (1989). Effects of procedural and distributive justice on reactions to pay raise decisions. Academy of Management Journal, 32(1), 115–130.
Galsworth, G. D. (1997). Visual systems: Harnessing the power of the visual workplace. New York, NY: AMACOM.
Gomez-Mejia, L. (2000). The role of risk sharing and risk taking under gainsharing. Academy of Management Review, 25(3), 492–508.
Hofstede, G. (1968). The game of budget control. London, UK: Tavistock.
Hu, L. T., & Bentler, P. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6, 1–55.
Johnson, E., Lowe, D., & Reckers, P. (2008). Alternative work arrangements and perceived career success: Current evidence from the big four firms in the US. Accounting, Organizations and Society, 33, 48–72.
Kline, R. (2005). Principles and practice of structural equation modeling. New York, NY: Guilford Press.
Korsgaard, M., & Roberson, L. (1995). Procedural justice in performance evaluation: The role of instrumental and non-instrumental voice in performance appraisal discussions. Journal of Management, 21(4), 657–669.
Langfield-Smith, K. (1997). Management control systems and strategy: A critical review. Accounting, Organizations and Society, 22(2), 207–232.
Lau, C., & Lim, E. (2002a). The effects of procedural justice and evaluative styles on the relationship between budget participation and performance. Advances in Accounting, 19, 139–160.
Lau, C., & Lim, E. (2002b). The intervening effects of participation on the relationship between procedural justice and managerial performance. British Accounting Review, 34, 55–78.
Libby, T. (1999). The influence of voice and explanation on performance in a participative budget setting. Accounting, Organizations and Society, 24, 125–137.
Libby, T. (2003). The effect of fairness in contracting on the creation of budgetary slack. Advances in Accounting Behavioral Research, 6, 145–169.
Lindquist, T. (1995). Fairness as an antecedent to participative budgeting: Examining the effects of distributive justice, procedural justice and referent cognitions on satisfaction and performance. Journal of Management Accounting Research, 7, 122–147.
Maiga, A. (2006). Fairness, budget satisfaction, and budget performance: A path analytic model of their relationships. Advances in Accounting Behavioral Research, 9, 87–111.
Mangel, R., & Useem, M. (2000). The strategic role of gainsharing. Journal of Labor Research, 21(2), 327–344.
Merchant, K. (1985). Control in business organizations. Marshfield, MA: Pitman Publishing.
Otley, D. (1994). Management control in contemporary organizations: Toward a wider framework. Management Accounting Research, 5, 289–299.
Parker, R., & Kohlmeyer, J. (2005). Organizational justice and turnover in public accounting firms: A research note. Accounting, Organizations and Society, 30, 357–369.
Podsakoff, P. M., & Organ, D. W. (1986). Self-reports in organizational research: Problems and prospects. Journal of Management, 12(4), 531–544.
Rosin, H. M., & Korabik, K. (1991). Workplace variables, affective responses, and intentions to leave among women managers. Journal of Occupational Psychology, 64, 317–330.
Ross, A. (1994). Trust as a moderator of the effect of performance evaluation style on job-related tension: A research note. Accounting, Organizations and Society, 19(7), 629–635.
Staley, A., Dastoor, B., Magner, N., & Stolp, C. (2003). The contribution of organizational justice in budget decision-making to federal managers' organizational commitment. Journal of Public Budgeting, Accounting & Financial Management, 15, 505–524.
Staley, A., & Magner, N. (2007). Budgetary fairness, supervisory trust, and the propensity to create budgetary slack: Testing a social exchange model in a government budgeting context. Advances in Accounting Behavioral Research, 10, 159–182.
Sweeney, P. D., & McFarlin, D. (1993). Workers' evaluations of the “ends” and the “means”: An examination of four models of distributive and procedural justice. Organizational Behavior and Human Decision Processes, 55, 23–40.
Welbourne, T. (1998). Untangling procedural and distributive justice: Their relative effects on gainsharing satisfaction. Group & Organization Management, 23(4), 325–346.
Welbourne, T., Balkin, D., & Gomez-Mejia, L. (1995). Gainsharing and mutual monitoring: A combined agency-organizational justice interpretation. Academy of Management Journal, 38(3), 881–899.
Welbourne, T., & Gomez-Mejia, L. (1995). Gainsharing: A critical review and a future research agenda. Journal of Management, 21(3), 559–609.
Wentzel, K. (2002). The influence of fairness perceptions and goal commitment on managers' performance in a budget setting. Behavioral Research in Accounting, 14, 247–271.
EARLY DETECTION OF FRAUD: EVIDENCE FROM RESTATEMENTS

Natalie Tatiana Churyk, Chih-Chen Lee and B. Douglas Clinton

ABSTRACT

Researchers are continually trying to find reliable fraud indicators (e.g., Beasley, 1996) and some are working on building fraud prediction models (e.g., Spathis, 2002) to aid auditors in fraud detection. With this same goal of predicting fraud in mind, the purpose of this study is to explore the potential of qualitative fraud risk indicators. Content analysis is used in analyzing the Management's Discussion and Analysis (MDA) section of the annual report to identify potential indicators of deception to increase the likelihood of fraud detection in a timelier manner than current quantitative models. By examining asynchronous communication contained in annual reports for companies required by the SEC to restate their financial statements, patterns of key linguistic characteristics were identified and compared to those used by companies not required to restate. Findings evidence significant differences on several dimensions. Using language cues for detection of deception has the advantage over quantitative methods of providing a more timely method of determining deception. Quantitative models often cannot detect deception until the effects are validated by financial impairment.
Advances in Accounting Behavioral Research, Volume 12, 25–40
Copyright © 2009 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012004
Implications of the findings suggest that qualitative methods of deception detection may provide an earlier, and thus more useful, method of the detection of fraud. The results of this study should provide stakeholders with a set of indicators to aid in identifying misstated information. This approach is also one that can be generalized to other written documents used to predict fraudulent communication.
INTRODUCTION

The incidence of fraudulent financial reporting has increased over the years, as indicated by the increase in the number of Accounting and Auditing Enforcement Releases (AAERs) filed by the SEC (www.sec.gov). The proportion of public companies announcing financial restatements rose from 3.7% to 6.8% from 2002 through September 2005, while total restatement announcements grew by approximately 67% over this period (www.gao.gov). In addition, the average direct financial loss from fraud rose by almost 40% from $1.7 million in 2005 to $2.4 million in 2007 (PWC, 2007). As indicated by this evidence, fraudulent financial reporting is clearly an important problem and impacts many parties, including investors, creditors, auditors, and other company stakeholders. Palmrose, Richardson, and Scholz (2004) confirm that the market reacts negatively to the announcement of fraud. In addition, auditors may be held liable for failing to detect material misstatements in financial reports (Bonner, Palmrose, & Young, 1998).

Professional standard setting bodies (e.g., AICPA) and other sponsored organizations (e.g., COSO) have issued standards and reports to aid auditors in detecting various types of fraud. For instance, SAS No. 99, issued in 2002, provides auditors with a list of risk factors categorized into three areas: incentives/pressure, opportunities, and attitude/rationalization. The presence of these risk factors does not always indicate the existence of fraud, but these factors are often present when fraud has occurred. When encountering these risk factors, auditors should elevate their professional skepticism and determine whether more extensive testing is necessary. Per SAS No. 99, auditors are required to assess the likelihood of fraudulent financial reporting during the audit process (AICPA, 2002). Both SAS No. 99 and prior studies (e.g., Pincus, 1989) provide lists of fraud risk factors (red flags) that would indicate the need for auditors to raise the assessed fraud risk and consider extending audit procedures.
Researchers have made many attempts to identify fraud indicators and to build fraud prediction models. For example, Beasley (1996) reports that a low proportion of outside members on the board of directors is a good fraud indicator. Dechow, Sloan, and Sweeny (1996) found similar results. They also found that SEC AAER issuances are associated with a lack of audit committees. Some researchers have turned to quantitative data on financial statements in detecting fraud. For example, Spathis (2002) developed a model able to correctly classify fraudulent financial statements 84% of the time. Unfortunately, by the time the data shows up in the financial statements, the damage is often severe and irreversible. Thus, reliability is not always the most important indicator of the most successful fraud prediction model. Another issue is that prior detection methods have involved considerable subjectivity and auditor judgment. There are generally two problems here. First, while relying on the professional skepticism of auditors is advised, the process is often very error prone (Frank & Feeley, 2003). Second, the process of following through with judgment-based investigation is often time consuming. Both of these disadvantages suggest the increasing need to find automated mechanisms to detect fraud. Contexts in which automated deception detection has been investigated include credit card fraud (Wheeler & Aitken, 2000), telecommunication fraud (Fawcett & Provost, 1997), and network intrusion (Mukherjee, Heberlein, & Levitt, 1994). In these selected contexts, automated methods of fraud detection are extended to the quantitative data approaches that provide ex post fraud detection. Unfortunately, although these mechanisms are useful for the contexts mentioned, they are also untimely from the view of preventing the fraudulent activity through early mechanisms of detection. 
In addition, deception detection based on quantitative data points only captures activity related to quantitative results, potentially omitting attitudes and intentions that might eventually be manifested as impactful fraudulent events. Defalcation generally starts on a small scale and increases over time. Thus, to be more effective at fraud prevention, methods of earlier fraud detection must be employed. This evidence suggests a context for the use of content analysis. Documents containing management’s perspectives and intentions can be found in the annual report. Indeed, the Management Discussion and Analysis (MDA) section of the annual report is intended for this purpose and provides a natural motivation for its use in testing for this study. Unfortunately, content analysis is not an exact science; and as applied to text-based, asynchronous media, the analysis is complicated by complex and
ambiguous language embedded in a context of non-standardized composition styles (Zhou et al., 2004b). However, researchers have refined the approach over time such that improvements have suggested the usefulness of the method in the area of early deception detection. Language-based cues employed in automated deception detection software have increased in discriminatory potential while simultaneously decreasing their dependence on context (Zhou et al., 2004b). However, distinct differences in the directional expectations of the application of some language-based cues (e.g. number of words used) to differing contexts (e.g. synchronous vs. asynchronous) still remain.
THEORY AND HYPOTHESES DEVELOPMENT

“Content analysis has been defined as a systematic, replicable technique for compressing many words of text into fewer content categories based on explicit rules of coding” (Stemler, 2001, p. 1). It has been used to analyze textual information in annual reports as well as to examine stock analyst reports, the content of president’s letters, and narrative disclosures as they relate to bankruptcy, good news versus bad news, and deceit (Previts, Bricker, & Robinson, 1994; Rogers & Grant, 1997; and Breton & Taffler, 2001).

Content analysis has also been used to examine concealment of negative information using either the president’s letter or chairperson’s statement. Abrahamson and Park (1994) examined 1,118 president’s letters to determine if corporate officers hide negative outcomes in communications with stockholders intentionally (i.e., as evidenced by short selling). They found that some environments led to intentional concealment and that disclosure of negative outcomes was more likely when financial performance was poor. When Abrahamson and Amir (1996) used content analysis to examine the president’s letter and to estimate a measure of negativity, they found that investors used the information to assess the quality of earnings. Smith and Taffler (2000) found that the information content in the chairperson’s statement was associated with firm failure. Tennyson, Ingram, and Dugan (1990) not only analyzed the chairperson’s statement with content analysis, but also examined the MDA to study firm failure. Using logistic regression, they found that the content in the president’s letter could be used to correctly classify bankruptcies 76% of the time and that the content in the MDA can be used to correctly classify bankruptcies 68% of the time.
Most germane to the current study, content analysis has also been used to identify deceit in direct personal communication (e.g., Zhou, Burgoon, Twitchell, Qin, & Nunamaker, 2004a; Lee, Welker, & Odom, 2005). Zhou et al. (2004a) collected information using content analysis on truthful and deceptive asynchronous computer-mediated communication (i.e., E-mail) and found significant differences between truthful and deceptive written communication. Although this differs from the analysis of the MDA in annual reports, text-based asynchronous communication is the common thread across Zhou et al. (2004a, 2004b, 2004c), and is used in the current study as well. Many prior studies using content analysis to detect deception in communication have focused on rich media channels such as face-to-face communication (e.g. see Kraut, 1978; Landry & Brigham, 1992; Porter & Yuille, 1996). In contrast, asynchronous communication allows the communicators to respond at their convenience rather than with an immediate response. Zhou et al. (2004a, 2004b, 2004c) and Hancock, Curry, Goorha, and Woodworth (2005) provide recent departures with the investigation of asynchronous, text-based communication. The results provide evidence that the asynchronous, text-based environment responds differently from the richer forms of synchronous communication and suggest valuable theory to deciphering language-based cues specific to this different medium and context. One example is that deceivers in rich, synchronous settings tend to use fewer words when attempting deception; alternatively, more words are used in attempting to deceive in an asynchronous environment. This is apparently because deceivers have more time to be thoughtful; and given the incentive to be persuasive, deceivers will use more words to attempt to be more convincing. In comparison, crafting a deceptive but highly believable story is much more difficult to accomplish with synchronous communication which requires an immediate response. 
Thus, the deceiver’s strategy is likely to be different. “. . . [A]synchronicity enables greater control and forethought, greater time for deceivers to plan, rehearse and edit what they say” (Zhou et al., 2004b, p. 90). Findings reported by Zhou et al. (2004b), Burgoon (2005), and Dunbar, Ramirez, and Burgoon (2003) support this expectation. Thus, restated MDAs should use more words in attempting deceptive communication, providing part one of our hypothesis. Zhou et al. (2004a) and similar studies (e.g., Zhou et al., 2004b) have expected and/or evidenced lower lexical diversity (i.e., unique words) and complexity for deceivers than truth tellers. This issue is distinct from the number of total words used since deceivers would be expected to use fewer unique words as they focus on their overarching point of achieving their
primary deception (i.e., part 2 of our hypothesis). Keeping the story relatively simple by failing to reflect the rich diversity of actual events and using less-descriptive terms may serve to obfuscate and provide an increased level of ambiguity (i.e., reduced content specificity) that deceivers desire while still striving to be convincing via a longer message. Thus, we would argue that increased word usage in attempts at deception through asynchronous communication remains consistent with the expectation of reduced lexical (vocabulary) and content diversity as supported by Zhou et al. (2004b). Part two of our hypothesis reflects this expectation. The reduction in organizational clarity in communication is a common strategy used by deceivers that echoes the argument for reduced lexical diversity and other language-based cues such as the provision of examples that might otherwise provide increased clarity of the message. This vagueness and uncertainty is typically exhibited by less punctuation in general and by less punctuation of the type that would otherwise be used in making points more explicit, that is, colons preceding examples or semicolons preceding a sentence fragment used as clarification. Words or terms related to examples (“for example” or “such as”) are also expected to be used less frequently by deceivers (Zhou et al., 2004a). Thus, we will test this effect with the expectation of fewer colons, fewer semicolons, and less language suggesting clarifying examples (i.e., parts 3–5 of our multipart hypothesis). Zhou et al. (2004a, 2004b) provided results confirming that deceivers evidence negative emotions if any. From these results, we also proposed that anxiety would be higher for deceivers than for truth tellers. Zhou et al. (2004b) explain that nonstrategic (i.e., unintended) cues such as nervousness, arousal, and tension are expected from deceivers.
Thus, parts 6–8 of our hypothesis express an expectation of lower levels of positive emotion and of optimism and energy, and an expectation of a higher level of anxiety. According to Zhou et al. (2004b), deceivers attempt to disassociate themselves from a deceptive message through nonimmediacy. This is consistent with a general strategy of obfuscation and equivocation. Relatedly, nonimmediacy overlaps to some degree with an expectation of lower measurable levels of cognition associated with causation (part 9 of our hypothesis) as the deceiver attempts to divorce him/herself from the message. We would also expect lower levels of the certainty construct from deceivers, as supported by Zhou et al. (2004a, 2004b). Based on the recommendations of Lee, Welker, and Odom (forthcoming), Wray and Lewis (2005), and Georgalis (2006), present tense verbs are also included in our hypothesis.
This suggests the formal statement of a multipart central hypothesis (alternate form):

Hypothesis. The MDA section of the annual report for companies that are required by the SEC (per AAERs) to restate their financial statements will, on average, contain (1) more words; (2) fewer unique words; (3) fewer colons; (4) fewer semicolons; (5) less use of the term “for example”; (6) fewer terms indicating positive emotion; (7) fewer terms reflecting optimism and energy; (8) more terms indicating anxiety; (9) fewer terms indicating causation; and (10) fewer present tense verbs than those of similar companies that are not required by the SEC to restate their financial statements.

This hypothesis reflects the historical priors of studies regarding the general tendencies of companies to use certain patterns of linguistic characteristics when rendering asynchronous communication in the context of the MDA section of fundamental annual reporting.
METHOD

Using an archival approach, the MDA section of annual reports was examined for companies required by the SEC to restate their financial statements relating to the time frame of 1989–2001, with AAERs issued from 2000 to 2003, and compared to that of similar companies where restatement was not required. The purpose of the comparison was to determine whether automated content analysis software using our hypothesized language-based cues could distinguish between deceptive communication and that of truth tellers based on a company’s MDA. Content analysis was performed on both sets of annual reports using the Linguistic Inquiry and Word Count program 2001 (LIWC, 2001). LIWC parses and identifies parts of speech, and also identifies syntax. It then analyzes the frequencies of the occurrences of language-based cues. The multipart hypothesis was tested by examining these relations as determined by the predefined LIWC 2001 linguistic software dimensions. Table 1 provides a summary of linguistic constructs and variable measures used as previously described. As displayed in Table 1, with the exception of the total words variable, all hypothesized variables were standardized by dividing each count by the total number of words and multiplying the result by 100.
Table 1. Summaries of Linguistic Constructs and Variable Measures.

Standard Linguistic Variables
1. Total words [a] – count of a written character or combination of characters representing a written word
2. Lexical diversity [a] – (total number of different words or terms / total number of words or terms) × 100, which is the percentage of unique words or terms in all words or terms

Organizational Clarity
3. Colons [a] – (count of colons / total number of words or terms) × 100
4. Semicolons [a] – (count of semicolons / total number of words or terms) × 100
5. For example [b] – (count of the term “for example” / total number of words or terms) × 100

Affective or Emotional Processes
6. Positive emotion [b] – (total number of words or terms indicating positive emotion / total number of words or terms) × 100; examples include “happy, pretty, good”
7. Optimism and energy [b] – (total number of words or terms indicating optimism / total number of words or terms) × 100; examples include “certainty, pride, win”
8. Anxiety [b, c] – (total number of words or terms indicating anxiety / total number of words or terms) × 100; examples include “nervous, afraid, tense”

Cognitive
9. Causation [b, c] – (total number of words or terms indicating causation / total number of words or terms) × 100; examples include “because, effect, and hence”

Certainty
10. Present tense [b] – (total number of present tense verbs / total number of words or terms) × 100; examples include “walk, is, be”

[a] As adapted from Zhou et al. (2004b).
[b] Output variable information and classification as identified with LIWC (2001).
[c] See Hancock et al. (2005).
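As a rough illustration of how the count-based measures in Table 1 could be computed, the following Python sketch derives total words, lexical diversity, and several per-100-word rates from a passage of text. It is not the LIWC 2001 program. The tokenizer, the function names (`per_100_words`, `linguistic_cues`), and the three-word positive-emotion list (taken only from the examples quoted in Table 1) are assumptions made for illustration.

```python
import re

# Illustrative list built only from the examples quoted in Table 1.
# The actual LIWC 2001 dictionaries are much larger and are not reproduced here.
POSITIVE_EXAMPLES = {"happy", "pretty", "good"}


def per_100_words(count, total_words):
    """Standardize a raw count: divide by total words, then multiply by 100."""
    return 100.0 * count / total_words if total_words else 0.0


def linguistic_cues(text):
    """Return a few Table 1 style measures for one document (assumed tokenizer)."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)  # simple word tokenizer (assumption)
    total = len(words)
    return {
        "total_words": total,  # the only measure that is not standardized
        "lexical_diversity": per_100_words(len(set(words)), total),
        "colons": per_100_words(text.count(":"), total),
        "semicolons": per_100_words(text.count(";"), total),
        "for_example": per_100_words(
            len(re.findall(r"\bfor example\b", lowered)), total
        ),
        "positive_emotion": per_100_words(
            sum(1 for w in words if w in POSITIVE_EXAMPLES), total
        ),
    }


if __name__ == "__main__":
    # Hypothetical sentence, not taken from any MDA in the sample.
    sample = "Sales were good; costs were good. For example: sales rose."
    for name, value in linguistic_cues(sample).items():
        print(name, value)
```

Keeping the standardization in one helper makes it easy to see that every variable except total words is expressed per 100 words, which is the convention the table describes.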
Sample The sample comprised companies that were issued an AAER during the time period 2000–2003 related to misstated earnings during 1989–2001. We also identified a control sample of matched companies that were not required to restate but that were similar to the object sample companies in industry and size (see Note 1). As reported in Table 2, we identified 311 firms for which AAERs were issued in the years 2000–2003, of which 66 firms contained AAERs for multiple years, reducing the number of firms to 245. This number was further reduced by 152 firms for which financial statements or information about prior revenues were unavailable, bringing the number to 93. This was further reduced by 25 firms for which no suitable matching data was obtainable
Table 2. Sample Selection.

SEC Accounting and Enforcement Releases (AAERs) 2000–2003                      Firms
Firms receiving SEC AAERs (revenue related)                                      311
Duplicate firms                                                                   66
Subtotal (unique individual firms)                                               245
Firms without financials or firms without identifiable prior year revenue       152
Potential sample firms                                                            93
Firms for which no match was found (match amended, etc.)                          25
Final individual firm count                                                       68
Final sample including multiple years for restated firms (i.e., firm years)     118
Table 3. Sample Industry Composition.

SIC Code     Frequency   Percent   SIC Description
1000–1999         2          3     Mining and construction
2000–2999        10         15     Manufacturing – food, tobacco, textile, apparel, lumber, furniture, paper, printing, chemicals, and refining
3000–3999        22         32     Manufacturing – rubber, leather, stone, metal, machinery, electronic, transportation, controlling instruments, miscellaneous
4000–4999         3          5     Transportation, communications, electric, gas, and sanitary
5000–5999         5          7     Retail trade
6000–6999         6          9     Finance, insurance, and real estate
7000–7999        17         25     Services – hotels, personal, business, automotive repair, motion picture, amusement
8000–8999         2          3     Services – health, legal, educational, social, museums, membership, accounting, engineering, research
9000–9999         1          1     Public Administration
Total            68        100
reducing the uniquely identifiable number of firms available for study to 68. For these 68 firms, 26 contained multiple year restatements, providing a final sample of “firm years” available for data comparison purposes of 118. Table 3 provides a breakdown of the composition of sample firms by SIC category to aid in the generalization of results to industry context. The table reveals that manufacturing (SIC 3000–3999; 32%) and service (SIC 7000–7999; 25%) industries provide the highest concentrations of companies (57% combined) in the sample, with the remaining firms spread across the other categories, suggesting that the sample is not systematically tied to any particular industry category.
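The matching step described in this subsection (same industry, similar size based on revenue) could be sketched roughly as follows. This is only an illustration under stated assumptions: the chapter does not say how "same industry" was operationalized, so the two-digit SIC major group used here is an assumption, as are the `Firm` fields, the function names, and the nearest-revenue rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Firm:
    name: str       # hypothetical identifier
    sic: int        # four-digit SIC code
    revenue: float  # revenue in a nonrestated year (see Note 1)


def same_industry(a: Firm, b: Firm) -> bool:
    # Assumption: "same industry" is approximated by the two-digit SIC major group.
    return a.sic // 100 == b.sic // 100


def find_match(target: Firm, candidates: List[Firm]) -> Optional[Firm]:
    """Pick the same-industry candidate whose revenue is closest to the target's."""
    pool = [c for c in candidates if same_industry(target, c)]
    if not pool:
        return None  # corresponds to "no suitable match" in Table 2
    return min(pool, key=lambda c: abs(c.revenue - target.revenue))


if __name__ == "__main__":
    # Hypothetical firms, not drawn from the study's sample.
    restated = Firm("RestatedCo", 3571, 500.0)
    controls = [
        Firm("ControlA", 3577, 450.0),
        Firm("ControlB", 3572, 900.0),
        Firm("ControlC", 7372, 500.0),
    ]
    print(find_match(restated, controls))
```

Returning `None` when no same-industry candidate exists mirrors the way unmatched firms drop out of the sample in Table 2.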
Procedure When the SEC formally charges a company with a violation, an AAER is pending. The AAER restatements are actually mandated by the SEC where they have successfully required the company to make the restatements through formal legal proceedings. Thus, the companies’ annual reports that were originally filed but later required to be restated were viewed as a proxy for deceptive communication in the context of an asynchronous, text-based setting. Content analysis was used to examine the narrative portion of both sets of annual reports (deceptive and matched sample). The Linguistic Inquiry and Word Count 2001 software (LIWC, 2001) was used to conduct the content analysis. The content analysis method generally uses the computer program to parse narrative documents and identify parts of speech, syntax, and the like. Once that operation is performed, the frequencies of the occurrences are analyzed. Using the LIWC (2001) program, the characteristics of the MDA sections of companies required to restate financial results were compared to those of companies not required to restate. Frequency counts were provided for these characteristics, which depicted the total number of words and the other language-based cues (standardized by word count), comprising the 10 hypothesized variable relations presented earlier.
Analysis The basic descriptive and univariate results were examined, and pairwise comparisons of the sample means for the restated versus nonrestated company years were conducted. A one-tailed, directional t-test between groups was used to test the pairwise comparisons of the sample means of total word count and frequencies of the other standardized variables specified by the multipart hypothesis.
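The between-groups comparison described above can be sketched as follows. This is a minimal illustration, assuming a pooled-variance (Student) t-test with a one-tailed alternative; the chapter does not state whether equal variances were assumed. The function names and the sample values are hypothetical and are not the study's data. The upper-tail probability is obtained by numerically integrating the t density so that the sketch needs only the standard library.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    upper = 0.5 - area            # P(T > |t|)
    return upper if t >= 0 else 1 - upper


def one_tailed_t(group_a, group_b):
    """Test H1: mean(group_a) > mean(group_b) with a pooled-variance t-test."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (mean(group_a) - mean(group_b)) / se
    return t, df, t_upper_tail(t, df)


if __name__ == "__main__":
    # Hypothetical word counts for illustration only.
    restated = [5200, 6100, 4900, 5600]
    matched = [4300, 4700, 4500, 4800]
    t_stat, dof, p_value = one_tailed_t(restated, matched)
    print(t_stat, dof, p_value)
```

For a variable hypothesized to be lower for restated firms, the groups would simply be passed in the opposite order. Statistical libraries offer equivalent tests (for example, SciPy's `scipy.stats.ttest_ind` with `alternative="greater"`), which would normally be preferred over a hand-written integration.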
RESULTS

Table 4 provides a summary of results that presents the 10 variables, directional hypothesis expectations, means for both the object, restated firms and those of the control, matched firms, along with t-statistics and accompanying p-values reflecting the pairwise comparison results and their
Table 4. Hypothesized Relationships and Results.

Variables                  Direction of Hypothesis   Means [a]: Restated   Means [a]: Matched   t-Statistics   1-Tailed p-Values
1. Total words             >                         5386                  4553                 1.74           0.04 [b]
2. Lexical diversity       <                         21.47                 23.51                2.34           0.01 [c]
3. Colons                  <                         0.10                  0.14                 2.73           0.003 [c]
4. Semicolons              <                         0.11                  0.15                 2.12           0.015 [b]
5. For example             <                         0.003                 0.007                1.87           0.03 [b]
6. Positive emotion        <                         2.07                  2.27                 2.54           0.005 [c]
7. Optimism and energy     <                         0.85                  0.95                 2.01           0.02 [b]
8. Anxiety                 >                         0.02                  0.04                 2.37           0.01
9. Causation               <                         1.42                  1.52                 1.66           0.045 [b]
10. Present tense          <                         2.72                  2.93                 2.52           0.005 [c]

[a] Since means are standardized by word count and then multiplied by 100, actual frequencies of variables are higher than the mean values shown here. For example, if one MDA has a word count of 10,000, then the colon mean of 0.10 indicates the presence of 10 colons in the MDA.
[b] Hypothesis confirmed at the 0.05 level (statistically significant and directionally correct).
[c] Hypothesis confirmed at the 0.01 level (statistically significant and directionally correct).
significance. Hypothesized values were generally expected to be lower for the variables tested in the study. The only exceptions to this were the expectation of a higher number of total words and a higher degree of anxiety for deceivers. In summary, nine of the 10 variable relationships were confirmed as hypothesized. The most significant relationships (p < 0.01) included lower lexical diversity, a lower number of colons, lower positive emotion, and a lower use of present tense verbs for restating companies. In addition, relationships at a conventional (p < 0.05) level of significance included a greater number of total words, a lower number of semicolons, a lower frequency of “for example,” a lower amount of optimism and energy, and a lower frequency of terms indicating causation. The only hypothesized result that was not confirmed was the expectation of a greater amount of anxiety. Overall, these results are supportive of our multipart hypothesis.
DISCUSSION

The purpose of this study was to determine if there are important qualitative differences inherent in the language characteristics of MDA sections of the annual report between companies required by the SEC to restate their financial statements and companies not required to restate. Accordingly, we confirmed nine of ten hypothesized relationships in this regard. These relationships generally describe expected results regarding the differences between the language-based cues related to deceptive communication as compared to the communication of truth tellers. A broader objective that is encouraged by these results is the implication of the ability to detect fraudulent activity in the form of deceptive communication. If this expectation is confirmed by additional replication and triangulation, then we may have evidenced a valid method of determining the likelihood of fraudulent activity that is timelier, and hence more useful than that of using financial metrics. Language-based cues have been used in various contexts to determine whether communication is deceptive or truthful. Accordingly, if these cues can be validated as sufficiently reliable, then determination of fraudulent activity may be aided by the use of these automated methods of discovery. The big payoff here is an earlier determination of fraudulent activity. That is, auditors have in the past used financial metrics to determine the likelihood of financial fraud. However, by the time the fraudulent effects
show up in the financial data, the damage to the firm and various stakeholders has already been done. Using language-based cues could provide early detection (or even intention to deceive) thus providing a much more useful form of safeguarding stakeholders. The one relationship that was not confirmed in the current study is that of an expectation of a higher level of anxiety as associated with asynchronous communication such as with the MDA section of an annual report. However, this unexpected result is likely unique to asynchronous communication. As explained by Zhou et al. (2004b),

Unlike interviews, in which respondents must construct answers spontaneously in real time, with little opportunity for prior planning, rehearsal, or editing, deceivers in this investigation had ample opportunity to create and revise their messages so as to make them as persuasive as possible (99). [Also] This can reduce the cognitive difficulty of the task as well as the anxiety associated with answering “on the fly” (90).
Accordingly, we believe that although negative emotion is likely associated with deceptive communication, the presence of anxiety may be ameliorated by the unique feature of increased time and the opportunity for advanced planning and for revision that is afforded through the mechanism of asynchronous communication.
FUTURE RESEARCH

Building on the current results suggests the comparison of fraud determination models based on financial metrics versus models based on qualitative language-based cues. One method of determining the incremental value of qualitative language-based models would be to compare the incremental explanatory power of a financial metric-based model using reliable, fraud-determining characteristics to a model based on language-based cues. The importance of such a study would be related to determining the incremental value of employing language-based models to what is already available in terms of fraud detection. Models involving language-based cues would initially already have the advantage since the success of the fraudulent activity would not be necessary to validate the detection method. If language-based methods were as effective as financial metrics, they would provide a much-preferred method of early indication of deception prior to the validation of fraudulent results.
CONCLUSIONS

We realize that there are challenges with the practical implications of generalizing the results of this study to the point of implementation or to the development of a practical fraud detection model. However, we are taking a long-term view toward the eventual implementation of a model using this approach. We believe that the ability to provide early detection is the attractive feature of the approach that we have attempted to highlight. Our hope is that additional research will help in this regard. Nevertheless, regardless of the ultimate impact on fraud, the fundamental premise of early detection of deceptive communication has value in its own right and is of interest to many parties. However, to benefit the actual fraud detection process, much development has yet to take place. As a mechanism for determination of fraudulent activity, the language-based cues inherent in employing content analysis to verify deception could increase the ability of auditors and other stakeholders to identify potential indicators of fraud far earlier than financial-metric based mechanisms.
NOTE

1. The norm for matching companies is by industry and size. While we matched size based on revenue, size for similar studies is often based on either total assets or total revenue. Since the incidence of restatement itself could have affected revenue size, we conducted the matching based on a nonrestated year.
REFERENCES

Abrahamson, E., & Park, C. (1994). Concealment of negative organizational outcomes: An agency theory perspective. Academy of Management Journal, 37(5), 1302–1334.
Abrahamson, E., & Amir, E. (1996). The information content of the president’s letter to shareholders. Journal of Business Finance and Accounting, 23(8), 1157–1182.
American Institute of Certified Public Accountants (AICPA). (2002). Consideration of Fraud in a Financial Statement Audit. Statement on Auditing Standards No. 99. New York, NY: AICPA.
Beasley, M. S. (1996). An empirical analysis of the relation between the board of director composition and financial statement fraud. The Accounting Review, 71(4), 443–465.
Bonner, S. E., Palmrose, Z., & Young, S. M. (1998). Fraud type and auditor litigation: An analysis of SEC accounting and enforcement releases. The Accounting Review, 73(4), 503–532.
Breton, G., & Taffler, R. J. (2001). Accounting information and analyst stock recommendation decisions: A content analysis approach. Accounting and Business Research, 31(2), 91–101.
Burgoon, J. K. (2005). The future of motivated deception and its detection. Communication Yearbook, 29, 49–95.
Dechow, P., Sloan, R. G., & Sweeny, A. P. (1996). Causes and consequences of earnings manipulation: An analysis of firms subject to enforcement action by the SEC. Contemporary Accounting Research, 13(1), 1–36.
Dunbar, N. E., Ramirez, A., Jr., & Burgoon, J. K. (2003). The effects of participation on the ability to judge deceit. Communication Reports, 16(1), 23–33.
Fawcett, T., & Provost, F. (1997). Adaptive fraud detection. Data Mining and Knowledge Discovery Journal, 1, 291–316.
Frank, M. G., & Feeley, T. H. (2003). To catch a liar: Challenges for research in lie detection training. Journal of Applied Communication Research, 31(1), 58–75.
Georgalis, N. (2006). The primacy of the subjective: Foundations for a unified theory of mind and language. Cambridge, MA: MIT Press.
Hancock, J. T., Curry, L., Goorha, S., & Woodworth, M. (2005). Automated linguistic analysis of deceptive and truthful synchronous computer-mediated communication. Proceedings of the 38th Hawaii International Conference on System Sciences (pp. 1–10).
Kraut, R. E. (1978). Verbal and nonverbal cues in the perception of lying. Journal of Personality and Social Psychology, 36, 380–391.
Landry, K., & Brigham, J. C. (1992). The effect of training in criteria-based content analysis on the ability to detect deception in adults. Law and Human Behavior, 16, 663–675.
Lee, C., Welker, R. B., & Odom, M. D. (2005). Credibility-enhancing displays as a source of cues for the detection in text-based, asynchronous, computer-mediated communication. Working paper.
Lee, C., Welker, R. B., & Odom, M. D. (forthcoming). Features of computer-mediated, text-based messages that support automatable, linguistics-based indicators for deception detection. Journal of Information Systems.
Mukherjee, B., Heberlein, L. T., & Levitt, K. N. (1994). Network intrusion detection. IEEE Network, 8, 26–41.
Palmrose, Z., Richardson, V. J., & Scholz, S. (2004). Determinants of market reactions to restatement announcements. Journal of Accounting and Economics, 37, 59–89.
Pincus, K. V. (1989). The efficacy of a red flags questionnaire for assessing the possibility of fraud. Accounting, Organizations and Society, 14(1/2), 153–163.
Porter, S., & Yuille, J. C. (1996). The language of deceit: An investigation of the verbal clues to deception in the interrogation context. Law and Human Behavior, 20, 443–458.
Previts, G. J., Bricker, R. J., & Robinson, T. R. (1994). A content analysis of sell-side financial analyst company reports. Accounting Horizons, 8(2), 55–70.
PricewaterhouseCoopers (PWC). (2007). Economic crime: People, culture and controls. PWC Global Economic Crime Survey. Martin Luther University, Economy and Crime Research Center: Dr. Kai-D. Bussmann.
Rogers, R. K., & Grant, J. (1997). Content analysis of information cited in reports of sell-side financial analysts. Journal of Financial Statement Analysis, 3(1), 17–30.
Smith, M., & Taffler, R. J. (2000). The chairman’s statement: A content analysis of discretionary narrative disclosures. Accounting, Auditing, and Accountability, 13(5), 624–646.
Spathis, C. T. (2002). Detecting false financial statements using published data: Some evidence from Greece. Managerial Auditing Journal, 17(4), 179–191.
Stemler, S. (2001). An introduction to content analysis. ERIC Digest. As cited by Berelson, 1952; GAO 1996; Krippendorff 1980; and Weber 1990. www.eric.ed.gov
Tennyson, B. M., Ingram, R. W., & Dugan, M. T. (1990). Assessing the information content of narrative disclosures in explaining bankruptcy. Journal of Business, Finance, and Accounting, 17(3), 391–410.
Wheeler, R., & Aitken, S. (2000). Multiple algorithms for fraud detection. Knowledge-Based Systems, 13, 93–99.
Wray, D., & Lewis, M. (2005). An approach to factual writing. Reading online, International Reading Association, ISSN 1096-1232, accessed December 12, 2005, available at www.readingonline.org/articles/writing/persuad.htm
Zhou, L., Burgoon, J. K., Twitchell, D. P., Qin, T., & Nunamaker, J. F. (2004a). A comparison of classification methods for predicting deception in computer-mediated communication. Journal of Management Information Systems, 20(4), 139–165.
Zhou, L., Burgoon, J. K., Twitchell, D. P., Qin, T., Nunamaker, J. F., Nunamaker, J. F., Jr., & Twitchell, D. P. (2004b). Automating linguistics-based cues for detecting deception in text-based asynchronous computer-mediated communication. Group Decision and Negotiation, 13, 81–106.
Zhou, L., Burgoon, J. K., Twitchell, D. P., Qin, T., Nunamaker, J. F., Zhang, D., & Nunamaker, J. F. (2004c). Language dominance in interpersonal deception in computer-mediated communication. Computers in Human Behavior, 20(3), 381–402.
JUSTIFICATION AND SELF-REVIEW: MITIGATING IRRELEVANT AFFECT IN FRAUD JUDGMENTS

Brad A. Schafer and Jennifer K. Schafer

ABSTRACT

Research in psychology and accounting suggests that affect (client likeability) toward a person can impact human judgment, resulting in more favorable treatment for likeable than dislikeable individuals. This study investigates whether two debiasing mechanisms, justification and self-review, mitigate the impact of affect (client likeability) on fraud risk assessments. Consistent with prior research on nonfraud audit judgments, this study finds that in the absence of any debiasing mechanism, inexperienced auditors are susceptible to affect biases in fraud judgments. Extending prior research, we find justification is not sufficient to mitigate likeability, but self-review is an effective mechanism to mitigate the effect of client likeability in a fraud judgment task. Supplemental findings indicate that general accounting experience, in itself, does not mitigate client likeability; however, the effectiveness of the self-review mechanism extends to these participants.
Advances in Accounting Behavioral Research, Volume 12, 41–59
Copyright © 2009 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012005
INTRODUCTION

This study investigates whether two debiasing mechanisms, justification and self-review, mitigate the impact of affect (client likeability1) on fraud risk assessments. In their review of judgment and decision-making research in auditing, Nelson and Tan (2005) note the importance and significant lack of research examining how auditors’ affect and emotion influence audit performance. At least three reasons contribute to the importance of affect research in auditing. First, research finds that auditing judgments are susceptible to unconscious biases due to the auditor–client relationship (e.g., client preference bias) even when financial incentives reward accuracy (e.g., Bazerman, Loewenstein, & Moore, 2002). Thus, understanding what biases are created by these relationships and how these biases can be mitigated is important. Second, less experienced auditors, who have been found to be most susceptible to affect biases (e.g., Bhattacharjee & Moreno, 2002), have a key role in collecting and documenting evidence and making preliminary recommendations to advanced auditors about significant auditing judgments (Ashton & Kennedy, 2002; Ricchiute, 1999). Thus, auditing research should seek to provide practice insight into improving the judgments of these specific auditors (Bonner, 1999). Finally, in the area of fraud judgment, auditing standards suggest that the fraud risk assessment is a process started in the planning phase of the audit, but it is to be revised throughout the audit engagement specifically based on the fieldwork findings (AICPA, 2002, SAS 99: paragraph 68). Subsequent judgments by audit partners will benefit if the work papers and preliminary judgments, which they rely on to make those judgments, are free of a systematic influence of client likeability. Psychology research (Regan, Straus, & Fazio, 1974) finds that a person’s likeability can influence unrelated judgments.
Affect-as-information theory (Schwarz, 1990) predicts that emotion will be utilized in judgments when the emotion is perceived (even errantly) to be relevant criteria. Bhattacharjee and Moreno (2002) find that a client’s likeability, which should be an irrelevant factor in an inventory obsolescence judgment task, influences the judgments of inexperienced auditors. Because inexperienced auditors interact with the client while collecting and documenting audit evidence, and they are the most likely to be influenced by client likeability, this research focuses on methods that auditing practice can implement to counter the impact of this irrelevant emotion in the fraud assessments of these auditors. Debiasing mechanisms, such as justification and self-review, have been shown to mitigate some cognitive biases (e.g., Kennedy, 1993, 1995). While inducing more effortful processing via a justification requirement (e.g.,
pretask instruction that an explanation will be required after the judgment) has been shown to improve some judgments, this research proposes that self-review (e.g., pretask instruction to consider alternative directional explanations for each cue) rather than justification may better reduce the impact of client likeability on inexperienced auditors’ fraud judgments. Self-review, also referred to as counter-explanation and “considering the opposite,” is a specific mitigating technique for data-related biases (Ashton & Kennedy, 2002; Koonce, 1992; Lord, Lepper, & Preston, 1984), and thus may be a more appropriate strategy for mitigating the effect of client likeability on fraud judgments. The results of this study indicate that, in a control condition (no debiasing strategy), inexperienced participants rate the likelihood of fraud higher for a dislikeable client than a likeable client, extending the results of prior research from an audit task of inventory obsolescence to a fraud risk judgment. Results also indicate that, while inducing effort through justification does not mitigate the effect of likeability, requiring participants to self-review effectively mitigates the influence of client likeability on fraud judgments. In addition, this study extends the literature on affective bias by examining whether years of experience or specialized audit knowledge can explain the lack of affective bias found in prior studies for experienced subjects’ judgments. We provide empirical evidence that, similar to inexperienced auditors, experienced accounting professionals who have a primary specialization other than auditing do succumb to the affective bias in a fraud task. Additionally, experienced accounting professionals who lack specialized audit knowledge benefit from the debiasing technique of self-review.
This result can assist practice to the extent that senior professionals in nonaudit areas consult in the audit opinion process and/or are invited to provide input on fraud discussions as discussed in SAS 99. The chapter proceeds as follows. First, it presents the theory and related hypotheses. Next, it describes the experimental method. It then reports the results of the main experiment and additional findings. Finally, it concludes with a discussion of the study’s implications.
THEORY AND HYPOTHESES

Objective evidence evaluation is paramount in the accounting profession (Kinney, 1999; Wright & Mock, 1985), and information received from the client may be one of the most pervasive types of audit evidence (Koonce, 1992). Less experienced auditors (staff) are the primary auditors engaged
with clients, performing fieldwork audit procedures, and making preliminary judgments for advanced auditor review and judgment. As members of the audit team, inexperienced professionals are in a position to provide input into and ultimately influence a variety of audit judgments, including initial fraud likelihood judgments (Ricchiute, 1999). Ricchiute (1999) finds that biased work papers prepared by lower-level auditors influence audit partner judgments. The importance of fraud judgments is well documented with the auditing standards stating that ‘‘members of the audit team should discuss the potential for material misstatement due to fraud’’ (AICPA, 2002, SAS 99: paragraph 14), and fraud consideration is to continue throughout the audit specifically based on conditions ‘‘found during fieldwork . . . ’’ (AICPA, 2002, SAS 99: paragraph 68). Thus, accountants can benefit from research that identifies debiasing mechanisms to improve fraud judgment. Prior research indicates that inexperienced auditors include client likeability (an irrelevant factor) in an auditing judgment (Bhattacharjee & Moreno, 2002). Specifically, less experienced auditors rated a client’s inventory obsolescence risk to be higher when they were provided with information designed to elicit a negative affective reaction toward the client. Bhattacharjee and Moreno (2002) point to the affect-as-information theory (cf., Schwarz, 1990) as an explanation whereby individuals use their perceived affective reactions as relevant information in their judgments. The affective evaluation of others is automatic and immediate (Zajonc, 1980). That is, people immediately feel attraction or avoidance toward others. The practical application of this research to the auditing profession is that an accountant can encounter client-related affect (client likeability) upon meeting the client at the beginning of the audit process, which may impact work paper documentation and subsequent audit judgments. 
While attribution is defined as seeking a cause for an event (Hastie, 1984), attribution error is errantly attributing someone’s behavior to disposition qualities rather than the appropriate factor (Fiske & Taylor, 1991). Regan et al. (1974) find that people make assessments of a stranger’s likeability and attribute that likeability in judging the stranger’s performance in a skilled task. That is, the performance of a liked (disliked) target is errantly attributed to the target’s ability when the performance is good (bad), but attributed to external situational factors when the performance is bad (good). In a fraud judgment, the immediate emotional response toward the client (likeability in the absence of information pertaining to honesty, integrity, etc.) should not be a criterion for the judgment as it does not increase fraud risk.2 Schafer (2006) finds that experienced auditors do not succumb to
client likeability in fraud judgment, but he finds that less experienced auditors do include client likeability in fraud judgment. To establish a necessary condition for the research question of this study and consistent with prior research, the current study hypothesizes that accountants who encounter a dislikeable (likeable) client will rate the fraud risk likelihood higher (lower) for that client.

H1. In the absence of any debiasing mechanism, participants who encounter a dislikeable (likeable) client will rate the fraud likelihood higher (lower) for that client.

Although many potential debiasing mechanisms exist to correct human judgment, this research focuses on a framework that provides a solid theoretical application to irrelevant affect (client likeability) and provides professional practice two reasonable mechanisms to implement. These include justification and self-review. First the psychological- and accounting-based framework is described, followed by specific predictions related to the debiasing mechanisms for mitigating client likeability in inexperienced auditors’ fraud judgments. Kennedy (1995) differentiates judgment biases into effort- and data-related biases. Effort biases are a result of judgment processes lacking the effort to cognitively include all (and only) relevant cues and properly weigh the correct cues (Kennedy, 1995). Kennedy (1993) demonstrates that instructing participants that a potential evaluation of their work will occur (inducing accountability) can stimulate cognitive effort to overcome order effects, an effort-based bias. However, Kennedy (1995) reports that data-based biases (using poor data), such as incorrect inclusion or omission of data (or incorrect weighing), require a different type of debiasing from effort-based mechanisms. The inclusion of likeability, an irrelevant cue, in judgments of fraud is, by definition, a data-related bias. Thus, an effort-based debiasing mechanism may not mitigate likeability.
Debiasing human judgment often includes the broad construct of accountability (e.g., see Lerner & Tetlock, 1999 for a review). While a common definition of accountability is the requirement to justify one’s judgment to others, accountability is not a unitary phenomenon (Lerner & Tetlock, 1999). In fact four distinguishable accountability manipulations exist, including (1) mere presence of another, (2) identifiability (no anonymity), (3) evaluation, and (4) reason giving (Gibbins & Newton, 1994; Lerner & Tetlock, 1999; Tetlock, 1983). This study focuses on reason giving, whereby participants expect they must give reasons for what they do.
Justification is one type of mechanism that has been found to mitigate biases such as recency and primacy (Kennedy, 1993; Tetlock, 1983). In this study, justification is defined as an instruction to participants that they will be asked to give reasons for their judgment (Lerner & Tetlock, 1999). This definition follows that of Ashton (1990), who required participants to document their justifications for decisions made.3 The framework for accountability’s effect is that judges have two mental processes available (heuristic/low-effort and systematic/high-effort) to evaluate judgment criteria. Low-effort judges utilize heuristics or simplified judgment models. These heuristics may lead to erroneous judgments. High-effort judges pay closer attention to the message cues rather than environmental cues, such as the likeability of the person sending the message (Chaiken, 1980). Justification in this regard should elicit cognitive effort without concern toward any perceived audience preference. Through this framework the justification requirement should elicit high mental effort for the judgment. Lord et al. (1984) suggest that to simply apply more effort (e.g., through justification) may not be an adequate mechanism to improve judgments when people do not look at subsequent evidence, or do look at the additional evidence but evaluate it incorrectly. For instance, Lord et al. (1984) suggest people’s beliefs can impact judgment through changing what evidence they consider and how they interpret evidence.4 Thus, simply asking for justification in circumstances where irrelevant information exists (e.g., meeting a dislikeable client) may lead to erroneous evaluation or incorrect weighting of subsequent evidence (relevant fraud or nonfraud). In support of this notion, Schafer (2006) found that client likeability was included in inexperienced auditors’ judgments and was not mitigated by a simple effort-inducing accountability strategy. 
A second debiasing mechanism that has the potential to mitigate client likeability is an instruction to self-review (consider opposite or alternative explanations) during the judgment process. People tend to use a causal explanation process in generating theories for judgments (Anderson & Sechler, 1986). Anderson and Sechler (1986) propose the strength of a belief for a particular explanation is greater when opposing theories or facts are not elicited. For example in an audit scenario, a person making a fraud judgment may attend to only data that highlight fraud risk, and thus, may rate fraud likelihood higher than if they consider added information that may not indicate fraud. Indeed, Lord, Ross, and Lepper (1979) find that people may discount or even interpret subsequent facts opposite of the true meaning, biased assimilation, to match a preconceived opinion. To correct this tendency of errant data encoding for judgment, Lord et al. (1984)
suggest inducing the decision-maker to consider diametrically opposed or opposite possibilities. By providing an instruction to elicit the decision-maker to “consider-the-opposite,” Lord et al. (1984) demonstrate that decision-makers make less polarized judgments that incorporate both sides of an argument. In an accounting context, Koonce (1992) hypothesizes and finds that auditors in a causal reasoning task (analytical procedures to explain unusual fluctuations in account balances) who focus on one-sided explanations rate the likelihood of the error cause higher than auditors performing self-review using a counter-explanation. While Koonce (1992) focused on the documentation effect of both explanation and counter-explanation via self-review, the results provide strong support that auditors instructed to consider the opposite are more likely to include alternative explanations in their judgment. The purpose of utilizing self-review is similar to the requirement for auditors to be skeptical in evaluating audit evidence in fraud (AICPA, 2002; Koonce, 1992). SAS 99 contains the specific instruction for auditors to exercise professional skepticism in their fraud risk judgments (AICPA, 2002: paragraph 13). Skepticism in this sense is withholding judgment until all evidence is collected (Hurtt, 2006). Kennedy (1995) finds that instruction of counter-explanation mitigates a data-related bias of the “curse of knowledge.” Indeed, auditing research has shown that varying forms of self-review (giving alternative explanations, asking to create alternative explanations, and considering the opposite) serve to reduce judgment extremes in auditors’ analytical review tasks (Heiman, 1990; Kennedy, 1995; Koonce, 1992). Reminding a person to self-review may be more effective at inducing a judgment based solely on relevant facts (Kennedy, 1995).
Rather than simply informing auditors that they will need to justify their judgments, asking them to consider all possible sides to a choice forces the otherwise nonsalient choices to become more salient. This should cause decision-makers to form reasons and arguments for all possible judgments, and thus, make them less likely to include irrelevant information in their arguments. In this case, self-review should induce decision-makers to consider whether evidence could support a judgment that fraud is likely as well as that fraud is unlikely. Ashton and Kennedy (2002) investigated a similar technique for effort-related cognitive bias. Within a belief revision task examining recency for auditors’ going-concern judgments, their participants received instructions to review contrary and mitigating factors in the evidence before making a final judgment. They found that the self-review technique mitigated a recency bias. Like the consider-the-opposite strategy, the self-review strategy
directly encourages the individual to carefully consider and more accurately weigh all evidence prior to making a final judgment. Because client likeability is irrelevant data in a fraud judgment, the self-review debiasing mechanism is predicted to be more effective at mitigating client likeability than a justification mechanism. Formally stated:

H2. The self-review strategy will be more effective than the justification strategy in mitigating the influence of client likeability on the fraud risk judgments of inexperienced auditors.
METHOD

An experimental setting was used to test the two hypotheses. The experiment examines whether (1) client likeability is utilized in the fraud judgments, and (2) self-review is more effective than justification in mitigating the influence of client likeability in the fraud judgments of inexperienced accountants.

Participants

Participants included 135 undergraduate senior-level accounting students. These participants have been shown to behave similarly to inexperienced staff accountants in prior research (Wright, 2007).5 The students completed the experimental case materials toward the end of the term after completing the course coverage of fraud detection in their upper division auditing course.

Design, Variables, and Materials

The experiment was a 2 × 3 design with two levels of affect (like/dislike) manipulated between subjects and three levels of debiasing mechanisms (control (no accountability), justification, and self-review) manipulated between subjects. The dependent variable was the participant’s assessment of fraud likelihood. The participants were randomly assigned to a likeable or dislikeable condition and one of the three accountability/debiasing conditions. The experimental materials consisted of: (1) an introduction page, where accountability was manipulated; (2) a video, where client likeability was manipulated;6 (3) a judgment page, where participants rated the likelihood of fraud; and (4) a debriefing section, where participants answered manipulation checks and demographic questions.
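The between-subjects layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: the chapter says that assignment was random but not how it was carried out, so the independent per-participant draw, the seed, the condition labels, and the helper name `assign_participants` are all hypothetical.

```python
import random
from collections import Counter
from itertools import product

# Factor levels as named in the text; the string labels are illustrative.
AFFECT_LEVELS = ("likeable", "dislikeable")
DEBIAS_LEVELS = ("control", "justification", "self-review")

# Crossing the two factors gives the six cells of the 2 x 3 design.
DESIGN_CELLS = list(product(AFFECT_LEVELS, DEBIAS_LEVELS))


def assign_participants(n_participants, seed=0):
    """Draw one cell at random, independently, for each participant.

    Hypothetical procedure: the source states only that assignment was random.
    """
    rng = random.Random(seed)
    return [rng.choice(DESIGN_CELLS) for _ in range(n_participants)]


# 135 participants, as reported in the Participants subsection.
assignments = assign_participants(135)
cell_counts = Counter(assignments)
for cell in DESIGN_CELLS:
    print(cell, cell_counts[cell])
```

Independent draws of this kind generally leave the cells unequal in size; a blocked or stratified scheme would be needed to force equal cells.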
On the introduction page, participants were informed that they were to assume the role of an auditor. They were told that they would view a video from an auditor–client interview and then would be asked to make a judgment about the likelihood of fraud in the financial statements. The case setting was intentionally vague, omitting industry and many other company characteristics, to force the relevant and irrelevant facts to be learned in the video. These facts were five statements eliciting a medium risk of fraud (Eining, Jones, & Loebbecke, 1997). The debiasing mechanisms were manipulated during the instruction to the participant. Schafer (2006) induced accountability through an instruction that participants would be required to list the reasons for their answer. In this study, the justification condition instructions were similar. On the instruction page, participants in the justification condition were informed that in addition to making the overall fraud likelihood judgment, “you will be asked to list the factors that led to your fraud likelihood judgment. Thus, it is important that you think about the factors that support your fraud likelihood position.” To elicit the self-review cognitive process as a debiasing mechanism, participants in the self-review condition were instead told that “you will be asked to list the factors that led to your fraud likelihood judgment. As you watch the video, you should consider why each statement made by the client may, AND why each statement may NOT, be an indicator of fraud.” In the self-review condition, the instruction is intended to elicit not only greater effort, but also a counter-explanation to the facts of the case. The manipulation of the debiasing techniques is similar to Kennedy (1995) in that they were provided as part of the task instructions. That is, the actual manipulation is in the instruction rather than the actual written justification and induced self-review.
The manipulation intensity may be less in this experiment than Kennedy (1995) in that his instructions noted that responses would be reviewed and participants may be called to explain and justify at a future date. Finally, in the control condition, the participants did not receive information regarding an explanation requirement. In the video portion of the materials, each participant watched a video of an ‘‘audit colleague’’ interviewing the CFO of a potential client of their firm. The video differed only in the first segment, which was the introduction portion of the client meeting. This segment of the video showed the CFO in one of two affect conditions. One group viewed a pleasant, considerate, and interested client (likeable). The other group viewed an abrasive, rude/unpleasant, and inconsiderate client (dislikeable). The second segment of the video showed the client answering several questions indicating a moderate risk of fraud. The moderate risk of fraud, rather than a low or high risk of fraud case, was
chosen to minimize a floor or ceiling effect on the fraud likelihood scale. The fraud discussion segment was adapted from Eining et al. (1997), and was identical for both groups. After viewing the video, participants provided an overall fraud judgment, an explanation, and answered several demographic questions. The fraud scale was an 11-point judgment scale adapted from Hoffman and Patton (1997). The scale had a lower endpoint of one, indicating very low chance of fraud; a midpoint of six, for an average fraud risk for clients of the firm; and a high end-point of eleven, indicating a very high chance of fraud.
Manipulation Checks

To evaluate the experimental manipulation of likeability, participants were asked several questions related to the character of the client in the case. Following the Regan et al. (1974) protocol, the two manipulation check questions for client likeability asked the participant to rate the client’s likeability on a nine-point scale from dislike to like and whether the participant would like to have the client as a colleague on a nine-point scale from definitely no to definitely yes. Because likeability is considered a central construct encompassing many psychological constructs (Brewer & Crano, 1994), a manipulation check comparison was performed in order to ensure that the reason for any differences in participant fraud judgments was attributable to likeability only, and not to characteristics which would be relevant to a fraud judgment. Honesty and sound accounting policies (competence) are specifically stated in the auditing standards as directly related to internal controls and fraud likelihood assessment (AICPA, 2002, SAS 99: paragraph 4). Thus, participants were asked to identify the CFO’s personality characteristics on a five-point scale anchored by “Definitely No” and “Definitely Yes.” These characteristics included kindness, cooperativeness, honesty, competence, and intelligence. Given that the experimental manipulation of likeability is successful, it is expected that honesty, competence, and intelligence will not be different between the likeability conditions.
RESULTS

Of the 135 participants, 64 students were assigned to the likeable condition, while 71 were assigned to the dislikeable condition. The manipulation of likeability was successful (p < .0001), with participants’ likeability ratings
higher in the likeable condition than the dislikeable condition for the manipulation questions on: (1) the likeability of the client (mean (standard deviation) 5.90 (1.09) and 3.28 (.80) for the likeable and dislikeable conditions, respectively) and (2) whether the participant would like to have the client as a colleague (mean (standard deviation) 5.24 (1.60) and 2.79 (1.29) for the likeable and dislikeable conditions, respectively).7 Furthermore, 91 percent of participants answered correctly on manipulation check questions related to the debiasing conditions. No participants were dropped from the analyses. An analysis of the data excluding the participants who did not answer correctly provides statistically similar results. The mean fraud likelihood judgments and ANOVA results are presented in Table 1. The ANOVA indicates a significant effect of client likeability by debiasing mechanism interaction on fraud likelihood judgment (F = 4.801, p = .010, two-tailed). As indicated in Panel B, the control (nonaccountable) condition participants rated the client more likely to have material fraud when the client was dislikeable (mean = 7.82) than when he was likeable (mean = 6.63). Simple main effects (see Panel C) indicate that the means are significantly different within the control conditions (F = 9.011, p = .004, two-tailed). This result provides support for H1 that inexperienced accountants are influenced by client likeability in a fraud risk judgment. Similar to the control condition, justification condition participants rated the client more likely to have material fraud when the client was dislikeable (mean = 7.95) than when he was likeable (mean = 6.92). Consistent with the theoretical prediction, induced effort through a justification requirement is not sufficient to mitigate the influence of client likeability on fraud judgment (F = 3.804, p = .061, two-tailed in the predicted direction).
Conversely, there was not a significant effect of client likeability in the self-review condition. That is, self-review mitigated the influence of likeability on fraud likelihood judgments. Further contrast analysis between the three groups indicates that the judgments of participants in the control group were not significantly different from the judgments of participants in the justification group. However, judgments of participants in the self-review group were significantly different from both the control (p = .002) and the justification (p = .027) groups. Taken together, these results support H2. Participants’ mean ratings for the client’s characteristics are indicated in Table 2, Panel A. In a fraud judgment, client likeability should be an irrelevant information cue. Characteristics such as kindness and cooperativeness are expected to be correlated to likeability, but also to be irrelevant to a fraud risk judgment. On the other hand, characteristics such as honesty, competence, and intelligence are relevant to this type of judgment (AICPA,
BRAD A. SCHAFER AND JENNIFER K. SCHAFER
Table 1. Participant Fraud Likelihood Judgment.

Panel A: Analysis of Variance

Source of variation             df   Mean square   F-value   p-value [a]
Corrected model                  5         9.313     3.703   .004
Likeable (like)                  1         6.540     2.600   .109
Debiasing mechanism (debias)     2         2.955     1.175   .312
Like × debias                    2        12.075     4.801   .010

Panel B: Means (Standard Deviations) for Fraud Likelihood Judgments [b]

Condition [a]      Likeable       Dislikeable    Marginal means
Control            6.63 (1.57)    7.82 (1.73)    7.22
  n                35             34
Justification      6.92 (1.11)    7.95 (1.64)    7.53
  n                13             19
Self-review        7.25 (1.65)    6.44 (1.46)    6.82
  n                16             18
Marginal means     6.84           7.51

Panel C: Relevant Simple Main Effect Comparisons between Groups

Effect                                          df   F-value   p-value [c]
Dislike versus like within control condition     1     9.011   .004
Dislike versus like within justification         1     3.804   .061
Dislike versus like within self-review           1     2.271   .142

[a] Each participant judged the likelihood of fraud for the client (1 = extremely unlikely, 11 = extremely likely).
[b] The likeable (dislikeable) group consists of students who viewed a likeable (dislikeable) video of a client. The control group had no instruction on a requirement to explain their judgment prior to viewing the video. The justification (self-review) group consists of accountants who were told that they would have to explain (explain and consider all possible outcomes in) their judgments.
[c] p-values are two-tailed.
2002). In order to draw conclusions about client likeability, it is essential to ensure that participants did not mistakenly interpret client likeability as honesty, competence, or intelligence. As expected, the groups differed in their rating of likeability (t = 15.39, p < .0001), kindness (t = 5.79, p < .0001), and cooperativeness (t = 3.89, p = .0002). The client was not rated significantly different on factors that would be relevant to a fraud likelihood assessment (honesty, competence, and intelligence). In addition, Table 2, Panel B provides a correlation table for the client characteristics. This table indicates that only “kind” and “cooperative” ratings for the client were correlated with the “likeable” rating (r = .5405, p < .0001; and r = .4288, p < .0001, respectively). These results provide support that
Justification and Self-Review
Table 2. Participant Means (Standard Deviation) for Postexperimental Questions.

Panel A: Client Ratings [a]

Characteristic   Likeable group   Dislikeable group   df    t-statistic   p-value
Likeability      5.90 (1.09)      3.28 (.80)          133   15.39         <.0001
Kind             3.82 (.91)       2.79 (1.16)         132    5.79         <.0001
Cooperative      4.17 (.90)       3.37 (1.47)         133    3.89         .0002
Honest           3.97 (.92)       3.88 (1.57)         133     .39         .6994
Competent        3.69 (.79)       3.75 (1.42)         133     .32         .7467
Intelligent      3.64 (.68)       3.67 (.74)          133     .21         .8356

Panel B: Pearson Correlations (p-Values) for Client Ratings

              Likeability     Kind            Cooperative     Honest          Competent       Intelligent
Likeability   1.00
Kind          .5405 (.0001)   1.00
Cooperative   .4288 (.0001)   .4651 (.0001)   1.00
Honest        .0406 (.6415)   .0194 (.8640)   .3443 (.0001)   1.00
Competent     .0373 (.6679)   .0200 (.8172)   .0194 (.8234)   .3489 (.0001)   1.00
Intelligent   .1518 (.0789)   .0846 (.3296)   .2052 (.0169)   .2193 (.0109)   .4511 (.0001)   1.00

[a] Likeability was measured on a scale from 1 (strong disliking) to 9 (strong liking), while the remaining characteristics of the client CEO were measured on scales ranging from 1 (definitely no) to 5 (definitely yes).
participants viewed the likeability construct as distinct from honesty, competence, and intelligence, which rules out these variables as alternative explanations for our findings.
Other Findings
Prior research has demonstrated that experience can mitigate client likeability in judgment (e.g., Bhattacharjee & Moreno, 2002; Schafer, 2006). However, the effects of specialized audit knowledge (i.e., expertise) versus more generalized accounting experience have not been clearly differentiated in client likeability studies. Even though a professional may have many years of generalized accounting experience, performing a specialized task incongruent with the bulk of the accountant’s past training and experience could lead to decreased performance (Shanteau, 1992; Vera-Munoz, Kinney, & Bonner, 2001). To examine the potential mitigating effects of general accounting experience on client likeability, 75 practicing CPAs averaging 17.04 years of accounting experience also completed the
experimental case. Although the professionals were experienced in years, the average percentage of time spent on audit duties (versus other accounting duties) was only 22 percent. Thus overall, the professionals had low specialized audit knowledge. Recent professional standards (AICPA, 2002) suggest that accountants from a variety of backgrounds may be involved in planning discussions and brainstorming where the likelihood of fraud is discussed. As an example, SAS 99, paragraph 17 specifically states “if the auditor has determined that a professional possessing information technology skills is needed on the audit team, it may be useful to include that individual in the discussion.” Similar to the work of inexperienced auditors, the preliminary judgments of nonaudit accountants who provide input to an audit team may influence advanced auditor (partner) judgments (e.g., Ricchiute, 1999). Thus, it is important to understand whether the judgments of these accountants are influenced by likeability.
An ANOVA identical to the main experiment was performed for the experienced professionals with statistically similar results. There was a significant interaction between client likeability and debiasing condition (F = 4.44, p = .0386, two-tailed). Participants in the justification condition rated the company significantly more likely (F = 3.83, p = .03, two-tailed) to have fraud when the client was dislikeable (mean = 8.44) than when he was likeable (mean = 7.19). However, in the self-review condition there was not a significant effect of client likeability on fraud likelihood judgments (mean = 6.90, dislikeable; mean = 7.30, likeable). These results indicate that general accounting experience does not mitigate the influence of client likeability on fraud judgments. However, the benefits of self-review extend to this group of professionals.
To examine whether specialized audit knowledge mitigates the influence of client likeability on fraud judgments, the analysis of the experienced professionals is further delineated based on participants’ self-reported percentage of audit duties. Individuals who spent less than 20 percent of their time on audit duties and had nonauditing job titles were considered to have no specialized audit knowledge. Since this final delineation results in small sample sizes, nonparametric tests (Kruskal-Wallis) were used. Results for participants with no specialized audit knowledge were statistically consistent with the broader sample. However, the mean fraud likelihood judgments of participants who had specialized audit knowledge did not exhibit the client likeability bias in either the justification or self-review condition, suggesting that those with some specialized audit knowledge are less susceptible to the bias.
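For readers unfamiliar with the nonparametric test named above, the following is a minimal sketch of how a Kruskal-Wallis H statistic is computed from ranked observations. The chapter reports only that the test was used; the participant-level data are not available here, so the ratings in the example are hypothetical and the function names are illustrative.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [value for group in groups for value in group]
    total_n = len(pooled)
    ranks = average_ranks(pooled)

    weighted_rank_sums = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        weighted_rank_sums += rank_sum**2 / len(group)
        offset += len(group)
    h = 12 / (total_n * (total_n + 1)) * weighted_rank_sums - 3 * (total_n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N) over tied blocks.
    tie_term = sum(count**3 - count for count in Counter(pooled).values())
    correction = 1 - tie_term / (total_n**3 - total_n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


if __name__ == "__main__":
    # Hypothetical fraud likelihood ratings (1-11 scale), NOT data from the study.
    likeable = [6, 7, 7, 8]
    dislikeable = [7, 8, 9, 9, 10]
    print(f"H = {kruskal_wallis_h(likeable, dislikeable):.3f}")
```

The resulting H is compared with a chi-square distribution with (number of groups minus 1) degrees of freedom; with very small groups, as in the supplemental analysis, that approximation is rough and exact tables are often preferred.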
DISCUSSION
As noted by Nelson and Tan (2005), there has been insufficient research considering the influence of affect in the judgment and decision-making of auditors. This study examines the conditions under which affective bias occurs, how the bias affects judgments and potential ways to mitigate the bias. Consistent with prior research on nonfraud audit judgments, this study finds that inexperienced auditors are susceptible to affect biases resulting from client characteristics in their fraud judgments. Specifically, participants rate fraud likelihood to be higher for dislikeable clients than for likeable clients. Two types of debiasing mechanisms were examined in an effort to mitigate the influence of client likeability in a fraud judgment. Inducing effortful thought through justification was not successful at eliminating the influence of client likeability on judgments. However, consistent with mitigating techniques for a data related bias, the influence of client likeability on fraud judgments was mitigated with self-review (a strategy designed to elicit consideration of alternative possibilities). Supplemental findings indicate that general accounting experience does not mitigate the influence of client likeability on fraud judgments. Conversely, specialized audit knowledge does appear to mitigate the bias in this sample. However, these supplemental findings should be interpreted with caution given the small number of participants with specialized audit knowledge. Future research should corroborate this finding using experienced audit professionals who devote all or most of their time to audit and fraud duties.
The results of this study have several implications for the accounting profession. Policy makers and advanced members of audit teams need to be aware that how they instruct team members to approach brainstorming or planning meetings can impact the evaluation of client evidence.
Inexperienced auditors can process and subsequently relay information from a biased viewpoint, but instructing members to self-review may mitigate the bias. As documented in prior research (e.g., Ashton & Kennedy, 2002; Ricchiute, 1999), these individuals have a key role in collecting and documenting evidence and making preliminary recommendations for senior auditors. Future research should consider how the implications of affect may integrate with recent research examining the implications of SAS 99 (e.g., Carpenter, 2007), such as the identification of fraud risks and the evaluation of evidence. Given that SAS 99 suggests that nonaudit ‘‘specialists’’ may be consulted on audit teams, there is also a need for future research to more
directly examine the potential affect bias of these individuals. Furthermore, since accounting professionals work in a hierarchical system, research should examine the potential impact reviewer preferences may have in situations where client likeability has the potential to influence judgment processes (Peecher, 1996). These directions may help auditors better understand in what conditions likeability bias persists, and for what circumstances the bias may be mitigated. Finally, this research looks at affect toward a specific target (client likeability). Another category of affect concerns the more general and persistent mood states. Future research in this area should consider the potential for various categories of affect to have differential influences on judgments. Readers should note that debiasing mechanisms that induce accountability are difficult to experimentally manipulate without the existence of real-world benefits or consequences. This paper used manipulations similar to prior research in an attempt to provide the best measure possible under experimental conditions. In this study, both the likeability and debiasing manipulations were subtle, and results could vary in conditions of very strong manipulations. For example, a client that is too kind or too likeable may cause an auditor to question the client’s motivation and induce an opposite effect to the likeability bias found here.
NOTES
1. Client likeability is a specific case of affect, defined as a positive or negative emotional evaluation of a target object or person.
2. Integrity and/or honesty of the client, which this paper views as distinct constructs from client likeability, is relevant to a fraud risk judgment. Consistent with Bhattacharjee and Moreno (2002), this study did not provide any information in the likeability manipulation that could increase the risk associated with the audit, or portray the client to be unreliable. Although the materials show the client to be likeable (dislikeable), they also portrayed them to be truthful and forthcoming with all information. The risk factors in these scenarios were identical for both clients. Similar to Bhattacharjee and Moreno (2002), the experiment creates a situation where the affective information was irrelevant to the auditor’s judgment. This was verified via pretesting and manipulation checks, which will be discussed in the results section.
3. The use of the term justification in our study is operationally differentiated from research focusing on the impact of reviewer preferences (evaluation) on reason giving (e.g., Peecher, 1996).
4. Biased assimilation (Lord et al., 1979), discounting (Regan et al., 1974), and confirmatory search (Klayman & Ha, 1987, 1989) are three examples of how initial belief can adversely impact judgment processing (see also Darley & Gross, 1983).
5. Students have been shown to be appropriate proxies for staff auditors in judgment tasks typically performed by less experienced auditors (e.g., Libby, Bloomfield, & Nelson, 2002; Peecher & Solomon, 2001; Tan, 2001).
6. Two video scenarios were used. Half of the students viewed the likeable (dislikeable) version of the video, while the other half of the participants were on a class break.
7. The manipulation check questions for likeability were significantly correlated (r = .81, p < .0001).
ACKNOWLEDGMENTS
The authors are thankful for the thoughtful input of the editor, associate editor, and anonymous reviewers. The authors are grateful for the comments from presentations at the 2005 AAA Annual meeting and the USF workshop series. The paper has also been improved by the written comments of Norma Montague, Chris Jones, Lee Kersting, and Linda Ragland.
REFERENCES
AICPA. (2002). Statement on auditing standards 99: Consideration of fraud in a financial statement audit. New York, NY: AICPA.
Anderson, C. A., & Sechler, E. S. (1986). Effects of explanation and counter explanation on the development and use of social theories. Journal of Personality and Social Psychology, 50(1), 24–34.
Ashton, R. H. (1990). Pressure and performance in accounting decision settings, paradoxical effects of incentives, feedback and justification. Journal of Accounting Research, 28(Suppl.), 148–180.
Ashton, R. H., & Kennedy, J. (2002). Eliminating recency with self-review: The case of auditors’ ‘going concern’ judgments. Journal of Behavioral Decision Making, 15, 221–231.
Bazerman, M., Loewenstein, G. F., & Moore, D. (2002). Why good accountants do bad audits. Harvard Business Review (November), 1–8.
Bhattacharjee, S., & Moreno, K. (2002). The impact of affective information on the professional judgments of more experienced and less experienced auditors. Journal of Behavioral Decision Making, 15, 361–377.
Bonner, S. E. (1999). Judgment and decision-making research in accounting. Accounting Horizons (December), 385–398.
Brewer, M. B., & Crano, W. D. (1994). Social psychology. St. Paul, MN: West.
Carpenter, T. (2007). Audit brainstorming, fraud risk identification, and fraud risk assessment: Implications of SAS 99. The Accounting Review, 82(5), 1119–1140.
Chaiken, S. (1980). Heuristic versus systematic information processing and the use of source versus message cues in persuasion. Journal of Personality and Social Psychology, 39(5), 752–766.
Darley, J. M., & Gross, P. H. (1983). A hypothesis-confirming bias in labeling effects. Journal of Personality and Social Psychology, 44(1), 20–33.
Eining, M. M., Jones, D. R., & Loebbecke, J. K. (1997). Fraud reliance on decision aids: An examination of auditors’ assessment of management fraud. Auditing: A Journal of Practice and Theory, 16(2), 1–19.
Fiske, S. T., & Taylor, S. (1991). Social cognition. New York: McGraw-Hill.
Gibbins, M., & Newton, J. D. (1994). An empirical explanation of complex accountability in public accounting. Journal of Accounting Research, 32(2), 165–186.
Hastie, R. (1984). Causes and effects of causal attribution. Journal of Personality and Social Psychology, 46(1), 44–56.
Heiman, V. B. (1990). Auditors’ assessment of the likelihood of error explanation in analytical review. The Accounting Review, 65(4), 875–890.
Hoffman, V. B., & Patton, J. M. (1997). Accountability, the dilution effect, and conservatism in auditors’ fraud judgments. Journal of Accounting Research, 35(2), 227–237.
Hurtt, K. (2006). Professional skepticism: An audit-specific model and measurement scale. Working Paper. Baylor University, Waco, TX.
Kennedy, J. (1993). Debiasing audit judgment with accountability: A framework and experimental results. Journal of Accounting Research, 31(2), 231–245.
Kennedy, J. (1995). Debiasing the curse of knowledge in audit judgment. The Accounting Review, 70, 249–270.
Kinney, W. R. (1999). Auditor independence: A burdensome constraint or core value? Accounting Horizons, 13(1), 69–75.
Klayman, J., & Ha, Y. W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94(2), 211–228.
Klayman, J., & Ha, Y. W. (1989). Hypothesis testing in rule discovery: Strategy, structure, and content. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(4), 596–604.
Koonce, L. (1992). Explanation and counterexplanation during audit analytical review. The Accounting Review, 67(1), 59–76.
Lerner, J. S., & Tetlock, P. E. (1999). Accounting for the effects of accountability. Psychological Bulletin, 125(2), 255–275.
Libby, R., Bloomfield, R. J., & Nelson, M. W. (2002). Experimental research in financial accounting. Accounting, Organizations and Society, 27, 775–810.
Lord, C. G., Lepper, M. R., & Preston, E. (1984). Considering the opposite: A corrective strategy for social judgment. Journal of Personality and Social Psychology, 47(6), 1231–1243.
Lord, C. G., Ross, L., & Lepper, M. R. (1979). Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence. Journal of Personality and Social Psychology, 37(11), 2098–2109.
Nelson, M., & Tan, H. T. (2005). Judgment and decision making research in auditing: A task, person, and interpersonal interaction perspective. Auditing: A Journal of Practice & Theory, 24, 41–71.
Peecher, M. E. (1996). The influence of auditors’ justification processes on their decisions: A cognitive model and experimental evidence. Journal of Accounting Research, 34(1), 125–140.
Peecher, M. E., & Solomon, I. (2001). Theory and experimental in studies of audit judgments and decisions: Avoiding common research traps. International Journal of Auditing, 5, 193–203.
Regan, D. T., Straus, E., & Fazio, R. (1974). Liking and the attribution process. Journal of Experimental Social Psychology, 10, 385–397.
Ricchiute, D. N. (1999). The effect of audit seniors’ decision on working paper documentation and on partners’ decisions. Accounting Organizations and Society, 24, 155–171.
Schafer, B. A. (2006). Affect and accountability in auditor judgment. Working Paper. University of South Florida, Tampa, FL.
Schwarz, N. (1990). Feelings as information: Informational and motivational functions of affective states. In: E. Higgins & R. Sorrentino (Eds), Handbook of motivation and cognition: Foundations of social behavior (Vol. 2, pp. 527–561). New York: Guilford.
Shanteau, J. (1992). Competence in experts: The role of task characteristics. Organizational Behavior and Human Decision Processes, 53, 252–266.
Tan, H.-T. (2001). Methodological issues in measuring knowledge effects. International Journal of Auditing, 5, 215–224.
Tetlock, P. E. (1983). Accountability and the perseverance of first impressions. Social Psychology Quarterly, 46(4), 285–292.
Vera-Munoz, S. C., Kinney, W. R., & Bonner, S. E. (2001). The effects of domain experience and task presentation format on accountants’ information relevance assurance. The Accounting Review, 76, 405–429.
Wright, W. (2007). Academic instruction as a determinant of judgment choice. Behavioral Research in Accounting, 19, 247–259.
Wright, A., & Mock, T. J. (1985). Towards a contingency view of audit evidence. Auditing: A Journal of Practice and Theory, 5(1), 91–100.
Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 35(2), 151–175.
DO PRINCIPLES- VS. RULES-BASED STANDARDS HAVE A DIFFERENTIAL IMPACT ON U.S. AUDITORS’ DECISIONS?
Joann Segovia, Vicky Arnold and Steve G. Sutton
Advances in Accounting Behavioral Research, Volume 12, 61–84. Copyright © 2009 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012006
ABSTRACT
Multiple stakeholders in the financial reporting process have articulated concerns over the rules-based orientation that U.S. accounting standards have adopted. Many argue that a more principles-based approach to standards setting, typified by international accounting standards, would improve the quality of financial reporting and strengthen the auditor’s position when dealing with client pressure, thereby enabling a focus on transparency and fairness of financial reports. In early 2009, the U.S. appeared poised to transition U.S. accounting standards to international accounting standards. The transition decision was made after the SEC Advisory Committee on Improvements to Financial Reporting (i.e., SEC Pozen Committee) publicly expressed strong support in its final report (SEC, 2008a). The SEC in turn issued its “Roadmap for the Potential Use of Financial Statements Prepared in Accordance with International Financial Reporting Standards by U.S. Issuers” on November 14, 2008 (SEC, 2008b), outlining the transition procedures. However, with Schapiro taking over as chairperson of the SEC, this move now appears less likely pending a stronger review of how
JOANN SEGOVIA ET AL.
principles-based international standards may impact the strength of financial regulatory oversight – a potential delay met with disdain by the pro-principles-based European regulatory community (Doran, 2009). While transition to international standards continues to progress, little research examining whether principles-based standards affect auditor decision-making has been conducted. The purpose of this study is to explore the impact of principles- vs. rules-based standards on auditors’ willingness to allow preparers leeway in reporting practices and to consider how auditors’ decision behavior is influenced by potential client pressure and/or opposing pressure from the SEC. Based on a sample of 114 experienced auditors, the results show that auditors are more willing to allow clients to manage earnings under rules-based standards, and these results persist even under external pressure. Results also indicate that more experienced auditors are less willing to allow clients who exert high pressure to report earnings aggressively, while SEC pressure has more effect on less experienced auditors. These results provide important insights to the FASB, SEC, and IASB as they weigh arguments underlying the principles- vs. rules-based debate.
INTRODUCTION
U.S. GAAP, the world’s most exhaustive body of accounting rules and guidelines, is overburdened with detail, riddled with exceptions and bound up in bright lines, all of which make it vulnerable to manipulation, say its legion of critics. GAAP . . . enabled executives and auditors to structure transactions that ostensibly conformed to accounting rules but violated basic tenets of transparency and fairness. (Osterland, 2005, p. 5)
In late July 2007, then SEC Chairman Cox announced the formation of the SEC Advisory Committee on Improvements to Financial Reporting chaired by Robert C. Pozen (hereafter the Pozen Committee) with the charge to address issues surrounding the complexity and transparency of financial reports (SEC, 2007). One of the first issues specifically targeted was ‘‘the current approach to setting financial accounting and reporting standards.’’ A month later, Pozen (2007) released a draft whitepaper to provide initial guidance to the Pozen Committee and noted specifically the need to consider: (a) a principles- vs. rules-based approach to standards; (b) the embodiment of exceptions, bright lines, and safe harbors in standards; and (c) the provision of timely implementation guidance. The committee’s final report was, not surprisingly, very supportive of a move to reconcile
Principles-Based vs. Rules-Based Standards
differences between U.S. GAAP and International Financial Reporting Standards (IFRS) as it advocated ‘‘the continued move to a single set of high-quality global accounting standards’’ (SEC, 2008a). The underlying foundation of IFRS is considered more principles-based requiring professional judgment for implementation, whereas GAAP is considered more rules-based, reducing the need for professional judgment. In parallel, then U.S. Secretary of the Treasury Paulson spear-headed hearings focused on assuring investors that auditing is transparent and sustainable (US Treasury, 2007a), and, in early October 2007, formed the Treasury Advisory Committee on the Auditing Profession (US Treasury, 2007b). As Paulson noted, ‘‘A transparent financial reporting system and vibrant auditing profession form the backbone of a marketplace investors can trust. Any plan to strengthen our capital markets must be based upon this principle’’ (US Treasury, 2007a). In August of 2008, SEC Chairman Christopher Cox announced that the SEC had unanimously approved a proposal to change financial reporting for publicly traded companies from US accounting standards (GAAP) to IFRS (http://www.sec.gov/news/speech/2008/spch082708cc_ifrs.htm). In November 2008, the proposal was followed by the ‘‘Roadmap for the Potential Use of Financial Statements Prepared in Accordance with the IFRS by U.S. Issuers’’ (SEC, 2008b) which outlines a plan for transition from GAAP to IFRS. The Roadmap proposes that publicly traded companies in the United States will be required to adopt IFRS by 2014, although firms can voluntarily adopt as early as 2010. 
However, the new SEC Chairperson, Mary Schapiro, has been less than enthusiastic for the transition to IFRS and has indicated that it may be postponed and revisited – a position that appears to have caught UK Treasury and European Commission officials by surprise when announced by Erik Sirri (director of the SEC’s trading and markets divisions) at a conference of European bankers in early March 2009 (Doran, 2009).
While the U.S. financial reporting community continues to debate the merits of moving the financial reporting process toward more principles-based international standards, little empirical research has been conducted that examines whether these standards will affect the professional judgment of auditors. The purpose of this study is to focus on the synergy between the reporting system and the strength of the auditor by specifically examining the influence of principles- vs. rules-based accounting standards on auditors’ decisions in allowing aggressive financial reporting by clients. In addition, this study considers the impact of competing pressures on the auditor that potentially arise: pressure from the client to allow aggressive reporting and,
in the opposite direction, pressure from regulators on the auditor – in this case pressure from the SEC. The accounting scandals that arose in the early 2000s initially fueled the debate over principles- vs. rules-based standards. The international community considers principles-based standards more effective because auditors must exhibit professional judgment by ‘‘thinking through’’ relevant issues and approach an audit with an objectives-based attitude (Piper, 2005). On the other hand, preparers prefer rules-based standards that provide specific guidance for decision-making and benchmarks for preparers and auditors to defend their decisions. The statement made by corporate leaders epitomizes rules-based standards: ‘‘tell me where it says I can’t’’ (Weil, 2002). Many audit professionals also support principles-based standards. Immediately after Enron’s fall, both PricewaterhouseCoopers’ and Grant Thornton’s CEOs noted problems with the rules-based approach used in the U.S. GAAP and publicly proposed switching to principles-based standards (Shortridge & Myring, 2004; D’Andrea, 2003). Even the U.S. Congress acknowledged that quality of earnings, transparency of financial reporting, and investor confidence in financial reporting were major issues and directed the SEC to ‘‘ . . . conduct a study on the adoption by the United States financial reporting system of a principles-based accounting system’’ (U.S. Congress, 2002, Sarbanes-Oxley Act (SOX) Section 108). While reluctant to back away from rules-based standards, both the FASB and the SEC have espoused the potential benefit of principles-based standards (Osterland, 2005). While prior studies have surveyed managers and solicited perceptions on the relative merits of principles- vs. rules-based standards, empirical behavioral work has not examined whether principles- vs. rules-based standards will differentially impact auditors’ decisions (Nelson, 2003). 
Surveying audit managers and partners, Nelson, Elliott, and Tarpley (2002) found evidence that auditors were not likely to curtail aggressive reporting when the client structured a transaction to meet precise, rules-based standards but noted an absence of experimental studies on this issue. The American Accounting Association Financial Accounting Standards Committee comments that ‘‘correspondingly, auditors are more likely to permit earnings management attempts through transaction structuring to stand when governing rules are precise and the transactions structuring is consistent with the rules’’ (AAA, 2003). Clor-Proell and Nelson (2007) provide evidence that the same issues may come into play if examples for compliance accompany principles-based standards, even if such examples do not actually fit the situation the preparer is addressing.
This study focuses on the interactions of principles- vs. rules-based standards with potential client pressure and/or potential regulatory pressure on auditors’ decision-making in the face of a client’s earnings management efforts. The purpose of this study is to experimentally investigate whether principles-based standards will reduce the auditors’ propensity to allow clients to aggressively report earnings and whether pressure exerted by external constituencies might affect decisions. In particular, will principles-based standards reduce earnings management? If so, will different types of standards result in different decisions even under external pressure from the client and/or SEC, and will any such difference be affected by the auditors’ level of experience? The study examines the decisions of 114 U.S. auditors. The results indicate that, under rules-based guidance, auditors are more likely to allow aggressive reporting and the difference between principles- and rules-based guidance is persistent under various types of pressure. In addition, client pressure impacts more experienced auditors’ decisions; as the client indicates a greater desire for aggressive reporting, more experienced auditors are likely to require a greater adjustment to reduce the extent of earnings management under both principles- and rules-based guidance. On the other hand, SEC pressure regarding potential investigation affects the decisions of less experienced auditors more than experienced auditors.
This study contributes to the financial accounting and auditing literature by providing initial evidence from an experimental setting that principles-based accounting standards, which provide general, conceptual guidance, may impact auditors’ decisions related to client’s aggressive reporting. These results have important implications for the SEC, FASB, and IASB as these regulatory and oversight bodies address future standards setting.
BACKGROUND AND HYPOTHESES DEVELOPMENT

Researchers have endeavored to better understand earnings management behavior by examining preparers’ and auditors’ role in allowing earnings management. Early work focused on archival studies that documented the prevalence of earnings management to meet explicit or implicit earnings benchmarks (see Healy & Wahlen, 1999, for a review) and examined various circumstances under which earnings appeared to increase or decrease (Barton & Simko, 2002; Frankel, Johnson, & Nelson, 2002). However, a major limitation of archival research on earnings management is that preparers’ decisions cannot be separated from auditors’ decisions as both
JOANN SEGOVIA ET AL.
are embedded in the resulting financial statements (Nelson et al., 2002). Limited behavioral research has examined auditors’ decisions related to the client’s desire to manage earnings, which could help to disentangle the influence of auditors from that of preparers. Behavioral studies examining earnings management indicate that incentives exist to encourage preparers to manage earnings (King, 2002) and auditors to permit aggressive reporting (Hackenbrack & Nelson, 1996). The formulation of past GAAP has seemingly exacerbated the problem. For instance, numerous inconsistencies exist in current standards for reporting liabilities that provide preparers with a selection of alternatives to structure transactions and manage earnings (Botosan, Koonce, Ryan, Stone, & Wahlen, 2005). Research has specifically considered the rules-based standard SFAS 128 and found empirical evidence of preparers structuring convertible bond offerings to increase diluted EPS (Marquardt & Weidman, 2005). Clor-Proell and Nelson (2007) find similar indications of transaction structuring when specific examples of implementation accompany a principles-based standard, even when the example does not match the preparer’s specific situation. These ‘‘examples’’ are representative of the ‘‘embodiment of exceptions [and] bright lines’’ that Pozen (2007) specifically suggests should be considered when weighing the usefulness and viability of principles- vs. rules-based standards. Nelson et al. (2002) report survey data suggesting that preparers are more likely to attempt earnings management using precise standards as a basis and that auditors are less likely to require adjustment when clients use precise standards. Auditors are also more likely to confront greater conflict, resulting in more negotiation efforts with the client, under flexible standards (Gibbins, Salterio, & Webb, 2001).
Auditors frequently do not require adjustment of aggressive reporting because: (1) the client demonstrates compliance with GAAP; (2) evidence that the client’s position is not correct is unavailable; or (3) the amount is not material (Nelson et al., 2002). Research also shows that auditors are more inclined to allow aggressive reporting when contingent economic rents are at stake (Beeler & Hunton, 2002). These findings are particularly important as the SEC and FASB work with the IASB to investigate the feasibility of moving to more principles-based standards. While survey results provide preliminary evidence that principles-based standards may reduce earnings management, actual behavioral evidence is necessary to better understand the decision-making patterns and the underlying factors that influence those judgments as auditors formulate decisions on the acceptability of their client’s accounting choices and possible earnings management intentions.
Nelson et al. (2002) set forth a theory regarding the impact of precise standards on managers’ aggressive reporting decisions and auditors’ decisions to accept these aggressive reporting decisions or to require adjustments prior to issuing an unqualified opinion. This theory, adapted from economics and law and applied to a financial reporting environment, posits that ‘‘ . . . managers respond to varying levels of rule precision by varying their earnings management attempts to minimize the chance their auditors will require adjustments.’’ In essence, when preparers structure transactions to meet the requirements of a rules-based standard, auditors are less likely to require adjustments. Nelson et al. (2002) further note that the more precise the standard, the more likely preparers will argue over the appropriateness of the accounting treatment and the more likely auditors will accept a client’s explanations.

Theory on the sociology of professions supports the notion that principles and ideals rather than specific rules should guide action (Kultgen, 1988). Overemphasizing rules may lead the professional to believe that rules that do not prohibit actions are acceptable and rules that prescribe actions are benchmarks (Kultgen, 1988, p. 31). In essence, the use of rules replaces the professional’s perceived need for judgment and deprofessionalizes the decision process – allowing the rules to subjugate professional judgment. This leads to the first hypothesis.

H1. Auditors will allow more aggressive reporting when the authoritative guidance is rules based rather than principles based.

Prior research has also shown that the client’s explanation for accounting transactions may affect the amount allowed by auditors. Information obtained through the inquiry of preparers is one of the most pervasive types of audit evidence (Koonce, 1992; Anderson, Koonce, & Marchant, 1994; Hirst, 1994a, 1994b).
Auditors have incentives to agree with the client whenever possible in order to maintain client relations and maintain business (Lochner, 1993; Schuetze, 1994). When regulators do not deny different accounting treatments of a similar transaction or provide flexibility in the method used to calculate the accounting treatment, clients can exert strong pressure on auditors to accept their perspective on ambiguous accounting matters (Johnson, Jamal, & Berryman, 1991; Anderson & Koonce, 1995; Shah, 1996). Auditing studies have frequently confirmed that various types of incentives and pressure can affect auditors’ decisions (Farmer, Rittenberg, & Trompeter, 1987; Lord, 1992; Hackenbrack & Nelson, 1996).
Power struggle theory (Goldman & Barlev, 1974) posits that the client exerts significant influence over auditors. The theory views the relationship between the client and auditors as an asymmetrical power relationship favoring the client in a conflict situation. Nichols and Price (1976) expanded Goldman and Barlev’s analysis theorizing that an asymmetrical dependency pattern exists as the client has a greater number of available alternatives for obtaining an audit. Monger (1981) provides support for Nichols and Price’s assertion concerning the likelihood a client will influence its audit firm’s professional judgment due to the asymmetrical relationship. If auditors do not yield to client’s pressure, the client can impose costs by threatening to or actually changing auditors (DeAngelo, 1981). Thus, clients may exert significant pressure on auditors to force allowance of aggressive earnings reporting. This client pressure may be more effective when auditors feel that the rules specified within a standard subjugate their professional judgment. Thus, auditors may be more willing to agree with the client’s aggressive reporting with rules-based standards, which leads to the second hypothesis.

H2. Auditors will allow more aggressive reporting when a client exerts greater pressure.

As early as 1998, then SEC Chairman Arthur Levitt expressed deep concerns over the intense pressure auditors were under to permit aggressive earnings management by clients (Levitt, 1998, 2000a, 2000b). A number of situations where gray standards left too much negotiation room for preparers accentuated these concerns (Iyer & Rama, 2004). In an effort to mitigate this pressure, the SEC (as a key regulatory oversight body) began using its enforcement capability to exert pressure on auditors to resist client’s aggressive practices and use their professional judgment to question inappropriately aggressive earnings management.
To counter client pressure on auditors, draw attention to concerns over earnings management and transparent financial reporting, and discourage preparers from managing earnings, the SEC has, in the past, sent letters to publicly traded companies suspected of earnings management. The letters indicated that the SEC was considering reviewing the companies’ financial statements due to earnings charges that significantly reduced earnings through asset write-downs, restructuring activities, or acquired in-process research and development. Since prior research investigating the effectiveness of enforcement actions by the SEC finds that, as a result of such action, auditing firms experience legal and/or reputation costs (Dechow, Sloan, & Sweeney, 1996; Moreland, 1995; Dopuch & Simunic, 1982; Davis & Simon, 1992; St. Pierre & Anderson, 1984), pressure from the SEC should affect
auditors’ decision to allow their client to aggressively report earnings. This leads to the third hypothesis:

H3. Auditors are less likely to allow aggressive reporting when the SEC has expressed concern to the client regarding potential investigation for earnings management.

Pressure from regulatory bodies balances to some degree the pressure auditors receive from clients to sign off on aggressive earnings management. When both the SEC and client exert pressure on auditors, the two pressures will influence auditors’ decisions in opposing directions. Whether pressure from the client or the SEC is stronger, or whether the two types of pressure will have an interactive impact, is unknown. This leads to hypothesis 4:

H4. An interaction will exist when the SEC has expressed concern to the client regarding potential investigation for earnings management and the client has exerted greater pressure on auditors to allow aggressive reporting.

Fee pressure from clients can create an incentive for audit seniors to focus on cost control over audit effectiveness (Houston, 1999; Public Oversight Board, 1999). Less experienced auditors also show a greater willingness to reduce risk levels and lower audit work to secure additional business services with a client (Moreno & Bhattacharjee, 2003) and place a greater emphasis on client service rather than auditor/practice development (Emby & Etherington, 1996). The socialization and mentoring process of auditors with less experience emphasizes the importance of their visibility in the firm, bringing in new business to the firm, and managing client relationships (Covaleski, Dirsmith, Heian, & Samuel, 1998). Managers and partners, on the other hand, tend to have more concern over litigation risk and the ability to use working papers to adequately support the audit opinion (Johnstone, 2000; Rich, Solomon, & Trotman, 1997; Trompeter, 1994).
Caplan and Kirshenheiter (2004) found that auditors’ experience impacted their preference for different types of standards. Less experienced auditors preferred rules-based standards due to the ease of decision-making, whereas more experienced auditors preferred less rules-based standards. These results suggest variance in experience may differentially impact the decisions of auditors when applying rules-based vs. principles-based standards. Experience also appears to impact the ability to deal directly with client pressures. In a study of critical factors impacting audit quality, client
pressure is one of the major concerns; however, increases in the level of experience of the auditor negotiating with the client mitigate client pressure (Sutton, 1993; Sutton & Lampe, 1991). A follow-up study in internal auditing (Lampe & Sutton, 1994) also provides support for those results. Trompeter’s (1994) results, however, indicate client pressure influences audit partners’ judgments more when the accounting standards are ambiguous and allow more flexibility in interpretation and application. Because pressure from the client and the SEC may affect auditors at varying experience levels differentially, controlling for experience is an important consideration when examining how different types of standards might affect decisions. Variance in experience appears likely to affect decision behavior. Thus, we need to examine the issues in this study with the knowledge that potential experience effects may exist.
RESEARCH METHODS

Design and Materials

To examine the impact of principles- vs. rules-based standards on auditors’ decisions, an experiment was conducted to determine the amount of expense that auditors would allow following the guidance of two different types of accounting standards. The experiment utilized a 2 (principles- vs. rules-based standard) by 2 (high vs. low pressure from client) by 2 (absence or presence of SEC pressure) between-subjects, repeated-measures design and was administered in a computerized format to prevent participants from changing answers once subsequent information had been obtained.1 After an introduction, participants received information about an audit of a hypothetical client – a publicly traded manufacturing firm that had exceeded earnings expectations and desired to reduce current year’s earnings. This included background, the past 5 years of financial data, current year financial data, and financial statement materiality. The type of accounting standard, whether principles- or rules-based, was manipulated using two scenarios in which the client had taken income-decreasing action. While income-decreasing activities reflect a more conservative measure of earnings management, those activities represent 31% of identified earnings management occurrences (Nelson, 2003). In the first scenario, the matching principle (as defined in Accounting Research Bulletin 43) was used to operationalize a principles-based standard where
the client desired to include an $800,000 expense for supplies that may not have been used. FASB defines a principles-based approach as broadly applied principles with few, if any, exceptions and less interpretive and implementation guidance. Applying ‘‘ . . . professional judgment consistent with the intent and spirit of the standard’’ is necessary (FASB, 2002). The authoritative guidance used to operationalize a rules-based standard, SFAS 121,2 Accounting for the Impairment of Long-Lived Assets, sets forth specific rules and guidance for determining when an asset is impaired and the amount of impairment loss (FASB, 1995). A rules-based standard is more precise and includes specific criteria, examples, exceptions, thresholds, and/or implementation guidance (Nelson, 2003). The impairment standard provides guidance on: (1) how to measure an impairment; (2) what events might signal an impairment; (3) a recoverability test; (4) how the different evidence might be weighted; (5) the possibility of a range of estimates for the amount or timing of cash flows; and (6) the consideration of the likelihood of possible outcomes. An audit client can use this complex standard to structure justification for their preferred treatment. In addition, researchers often note the write-off of impaired assets is a vehicle used to manage earnings (Levitt, 1998; Riedl, 2004). In the second scenario, the client desired to write off an asset as impaired although the company was still using the asset to generate revenues. In both scenarios, the client desired to reduce income by $800,000 with no support other than management’s assertions. The evidence in the impairment scenario clearly indicates that revenues from the asset have not declined in the current year and are in pace with revenues of the previous year resulting in no impairment loss, while evidence in the supplies case indicates that the supplies have not been used.
The underlying facts are the same, other than the details of the proposed expense for supplies or impairment. The broader issue is whether auditors are willing to allow the client to take action that will reduce earnings by $800,000, a material amount, regardless of the scenario. These scenarios use actual authoritative guidance to examine the principles- vs. rules-based issue and its impact on auditors’ decisions. Because we wanted to utilize experienced auditors, we were concerned that the auditors’ knowledge of existing standards would bias their judgment if we used ‘‘contrived’’ standards. We decided that the best way to initially examine this issue was to utilize two different standards as authoritative guidance and make the salient facts as close as possible given the differences in scenario. Participants were first provided with preliminary information and then asked to determine the amount of impairment loss or supplies
expense to report in the financial statements. The experimental materials included the appropriate authoritative guidance (either SFAS 121 or ARB 43) that governed the particular issue in order to ensure that participants were familiar with the specific standard. In the second part of the instrument, two types of pressure were manipulated – pressure from the client and from the SEC. In the high client pressure condition, the client expressed a strong desire to manage earnings and pressured the auditor to allow the additional expense; in the low-pressure condition, the client expressed a desire to fairly present the financial statements and exerted no pressure on the auditor. To manipulate pressure from the SEC, half of the participants received a copy of a letter from the SEC to the client notifying the client of concerns over suspected earnings manipulation, whereas the other half received no letter. This manipulation contained the exact wording of a letter that the SEC had previously sent to 150 companies. The letter addressed companies’ significant charges for asset write-downs, restructuring activities, or acquired in-process research and development. After receiving the additional information, participants again stated the amount of loss or expense to include in the financial statements. In order to provide a level of accountability and motivation to perform the task in this experiment, participants justified each decision in a short memo. Participants also provided general demographic information upon completion of the case.3 Table 1 presents the eight different treatments resulting from this experimental design.
Table 1. Experimental Design.

Treatment  Preliminary Information:  Additional Information:  Additional Information:
Level      Type of Standard          Client Pressure          SEC Pressure
           (Principles- vs.          (High Pressure;          (Present or Absent)
           Rules-Based)              Low Pressure)
1          Rules-based               High                     Present
2          Rules-based               Low                      Present
3          Rules-based               High                     Absent
4          Rules-based               Low                      Absent
5          Principles-based          High                     Present
6          Principles-based          Low                      Present
7          Principles-based          High                     Absent
8          Principles-based          Low                      Absent
Note: Participants were asked to make an initial estimate of the allowable expense or loss after reading the preliminary information and were asked to revise their estimate after reading all of the additional information.
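The eight treatment levels in Table 1 are the full crossing of the three two-level factors. As an illustration only (the function name and dictionary keys below are hypothetical and are not part of the experimental instrument), a minimal Python sketch can enumerate the cells in the same order the table lists them:

```python
from itertools import product

# Factor levels, ordered so that the enumeration matches Table 1:
# type of standard varies slowest, client pressure varies fastest.
STANDARDS = ["Rules-based", "Principles-based"]
SEC_PRESSURE = ["Present", "Absent"]
CLIENT_PRESSURE = ["High", "Low"]


def treatment_levels():
    """Return the eight cells of the 2 x 2 x 2 design as a list of dicts."""
    cells = []
    combos = product(STANDARDS, SEC_PRESSURE, CLIENT_PRESSURE)
    for level, (standard, sec, client) in enumerate(combos, start=1):
        cells.append(
            {
                "level": level,
                "standard": standard,
                "client_pressure": client,
                "sec_pressure": sec,
            }
        )
    return cells


if __name__ == "__main__":
    for cell in treatment_levels():
        print(cell["level"], cell["standard"],
              cell["client_pressure"], cell["sec_pressure"])
```

Each participant sees exactly one of these cells; the initial estimate is collected after the standard manipulation only, and the final estimate after both pressure manipulations.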
Participants

The participants included 117 auditors from international and regional audit firms and were randomly assigned to one of the eight treatment levels. Three of the responses were omitted from the analysis – one because the diskette was damaged and could not be read and two because the participants failed to pass the manipulation check, leaving 114 useable responses. Fifty-nine participants completed the impairment case and 55 completed the supplies case. The case was administered to 91 participants via a firm contact person and 26 participants in firm training sessions,4 taking approximately 40 minutes to complete. Average experience for all participants was 7.1 years. Table 2 shows the number of participants based on rank within their organization. Overall, participants were experienced and capable of making the required decision.
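The sample counts reported above can be reconciled with a few lines of arithmetic. The following Python sketch is illustrative only; the variable and function names are hypothetical, and the rank counts are those reported in Table 2.

```python
# Counts as reported in the text and in Table 2.
RECRUITED = 117
DROPPED = {"damaged diskette": 1, "failed manipulation check": 2}
BY_CASE = {"impairment case": 59, "supplies case": 55}
BY_CHANNEL = {"firm contact person": 91, "firm training session": 26}
BY_RANK = {"In-charge": 50, "Manager": 39, "Senior manager": 13, "Partner": 12}

# Useable responses after the three omissions.
usable = RECRUITED - sum(DROPPED.values())


def percent_of_total(counts):
    """Express each count as a percentage of the column total (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print("useable responses:", usable)
    print("by case:", sum(BY_CASE.values()))
    print("by channel:", sum(BY_CHANNEL.values()))
    print("rank percentages:", percent_of_total(BY_RANK))
```

Under these figures the two cases and the four ranks each sum to the 114 useable responses, while the two administration channels sum to the 117 auditors originally recruited.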
Table 2. Demographic Information.

Position         Number of Participants   Percent of Total
In-charge        50                       43.9
Manager          39                       34.2
Senior manager   13                       11.4
Partner          12                       10.5

RESULTS

Table 3. Responses to the Amount of Allowable Expense or Loss Summarized by Type of Accounting Standard (Mean; Standard Deviation).

Type of Standard   Number of Participants   Initial Estimate     Final Estimate       Amount of Change
Rules-based        59                       $309,410; 313,294    $287,400; 317,406    ($22,011); 162,613
Principles-based   55                       $158,609; 239,782    $203,109; 240,915    $44,500; 145,376

A preliminary examination of the data shown in Table 3 provides the mean responses for both the initial and final estimate of the allowable expense under the two types of standard. Participants using the rules-based standard
allowed the client to report a greater amount of expense than those using the principles-based standard. Participants made the initial estimate of allowable expense or loss after reading the preliminary information that contained the principles- vs. rules-based manipulation and the materiality estimate of $250,000. On average, participants initially estimated the allowable expense at $309,410 for the rules-based condition and $158,609 for the principles-based condition. Thus, participants in the rules-based condition allowed the client to reduce income by a material amount. Participants in the principles-based standard condition allowed a smaller, immaterial amount as acceptable. Pressure from the client and SEC was manipulated by providing additional information before asking participants to revise their estimate of the allowable expense. On average, participants in the impairment condition reacted to the additional facts by reducing the amount they considered allowable to $287,400, even though the final estimate was still above the materiality level. On the other hand, participants in the principles-based condition increased the amount considered allowable to $203,109. While the final estimate was still below materiality, this represented a 28% increase in the amount considered allowable. To address the first research question, whether auditors will allow clients to be more aggressive when the authoritative guidance is more rules-based rather than principles-based, the participant’s initial estimate was used as the dependent variable with type of standard as the independent variable and years of experience as the covariate (i.e., a control variable). The initial estimate was the best measure to use for this analysis, as the only variable manipulated prior to that decision was the type of standard.
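The comparisons in this paragraph follow directly from the reported means and the $250,000 materiality estimate. A minimal Python sketch of that arithmetic is given below; the function and key names are hypothetical, and the inputs are the rounded means from Table 3, so the rules-based change comes out one dollar away from the tabulated ($22,011), which was presumably computed from unrounded data.

```python
MATERIALITY = 250_000  # materiality estimate given to participants

# Mean initial and final estimates of allowable expense (Table 3).
MEANS = {
    "Rules-based": {"initial": 309_410, "final": 287_400},
    "Principles-based": {"initial": 158_609, "final": 203_109},
}


def summarize(initial, final, threshold=MATERIALITY):
    """Compare one condition's mean estimates with each other and with materiality."""
    change = final - initial
    return {
        "change": change,
        "pct_change": round(100 * change / initial, 1),
        "initial_material": initial > threshold,
        "final_material": final > threshold,
    }


if __name__ == "__main__":
    for condition, m in MEANS.items():
        print(condition, summarize(m["initial"], m["final"]))
```

The rules-based means stay above materiality both before and after the pressure manipulations, whereas the principles-based means stay below it even after rising by roughly 28%.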
The ANCOVA results indicate that the amount allowed under the rules-based standard was significantly higher than under the principles-based standard (p < .0001) and the experience covariate was not significant (p = .3522). The results indicate that auditors may be initially inclined to allow aggressive reporting under rules-based standards. The second and third hypotheses address the effects of pressure from the client and SEC on allowing the client to aggressively report earnings, while the fourth examines the interaction of the two types of pressure. As noted earlier, the initial estimate was made after the type of standard was manipulated. The final estimate was made after the pressure manipulations. To examine the impact of these different types of pressure, an ANCOVA analysis was conducted with the final estimate as the dependent variable, type of standard, client pressure, and SEC pressure as the independent variables, and initial estimate and experience as covariates.
Table 4. ANCOVA Results Based on Final Estimate.

Variable                                                    Calculated F   Significance*
Covariates
  Initial estimate                                          253.71         .0001
  Experience                                                .10            .7525
Main effects
  Standard(a)                                               .17            .6799
  Client pressure                                           6.96           .0097*
  SEC pressure                                              3.68           .0580*
Interactions
  Standard * Client pressure                                .07            .7858
  Standard * SEC pressure                                   .66            .4174
  Client pressure * SEC pressure                            .07            .7955
  Experience * Standard                                     .25            .6152
  Experience * Client pressure                              5.53           .0207*
  Experience * SEC pressure                                 5.11           .0260*
  Standard * Client pressure * SEC pressure                 .68            .4109
  Experience * Standard * Client pressure                   .90            .3444
  Experience * Standard * SEC pressure                      1.52           .2200
  Experience * Client pressure * SEC pressure               .16            .6886
  Experience * Standard * Client pressure * SEC pressure    .01            .9078

*Significant.
(a) Presented as part of the ANCOVA model, but this is not a meaningful measure for standard as it contains potential interaction for the other two variables, client pressure and SEC pressure.
Table 4 presents the results of the ANCOVA analysis for both the main effects and interactions. The covariate, response to preliminary information, was significant (p < .001), which indicates that the participant’s final decision was highly correlated with their initial decision. This means that the initial differences due to the type of standard persisted under both client and SEC pressure. The results also indicate that the pressure from the client and SEC significantly impacted the decision process. In addition, both types of pressure interacted with experience, making conclusions about their impact more complex. To understand the interaction between (1) years of experience and client pressure and (2) years of experience and SEC pressure, additional analysis was required. A linear regression model was developed for each of the eight treatments. The parameters for the variables were used for the slopes and intercepts of linear regressions for each of the eight treatments. The model is summarized below as expected values (means) of the response (amount of
allowable expense based on additional information) for a given value for the covariate years of experience:

\[ EV = P_{tl} + (P_e \times Y) + C \]

where $EV$ is the expected value of the amount of expense, $P_{tl}$ the parameter estimate for the treatment level, $P_e$ the parameter estimate for years of experience, $Y$ the number of years of experience, and $C$ a constant for the amount based on preliminary information. Each of the eight treatment levels has a parameter estimate. In addition, the regression model included a parameter estimate for the years of experience within each treatment level. The number of years of experience was entered into the above regression model for the range of 3–20 years experience.5 The final term in the regression model was a constant for the amount of expense allowed based on the preliminary information. This constant resulted in a scaling of the predicted values that was representative of the amount of allowable expense. Based on these regression models, the slopes of the regressions were tested to determine whether significant differences existed and how experience affected the main variables of interest, client pressure and SEC pressure. The hypotheses of equal slopes were tested by a partial F-test that used the regression models as the full model and reduced models that force a common slope for each of the linear regressions at different levels of experience. Four levels of experience were chosen based on the distribution of years of experience: 3 as a low value, 5 as a value near the lower quartile, 8 as a value near the upper quartile, and 15 as a large value. Table 5 presents the results of these tests. Each treatment has a separate regression, and the slopes of the eight regressions are compared. Since the slopes are not the same, the predicted value for the specified years of experience is used in order to make comparisons between the experience and the two independent variables, client pressure and SEC pressure.
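To make the mechanics of this model concrete, the following Python sketch evaluates the expected-value equation at the four experience levels and computes a partial F statistic for two nested regressions. It is a sketch under stated assumptions: the chapter does not report the parameter estimates, so every numeric value below is hypothetical, and the partial F formula is the general textbook form rather than a reproduction of the authors' exact computation.

```python
def expected_value(p_treatment, p_experience, years, constant):
    """EV = P_tl + (P_e * Y) + C for a single treatment level."""
    return p_treatment + p_experience * years + constant


def partial_f(sse_reduced, sse_full, df_reduced, df_full):
    """Textbook partial F statistic for nested regression models.

    sse_* are residual sums of squares; df_* are residual degrees of
    freedom, so df_reduced > df_full when the reduced model has fewer
    parameters (e.g., a common slope forced across treatments).
    """
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    p_tl, p_e, c = 10_000, -2_000, 250_000
    for years in (3, 5, 8, 15):  # experience levels used in Table 5
        print(years, expected_value(p_tl, p_e, years, c))

    # Hypothetical sums of squares and degrees of freedom.
    print(partial_f(sse_reduced=120.0, sse_full=100.0,
                    df_reduced=12, df_full=10))
```

A larger partial F indicates that forcing a common slope fits the data noticeably worse than allowing each treatment its own slope on experience.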
Client pressure was significant (p = .0421) for the higher experience level but not for lesser experienced participants. This means that participants with 15 years or more of experience were more influenced by client pressure than participants with less experience. When the client exerts high pressure, more experienced participants react negatively and are less willing to allow the client to report earnings aggressively; but, when the client expresses a desire to report earnings fairly rather than pressure to report aggressively, the more experienced auditors allow the client to report greater expense.
Table 5. Regression Test of Means.

Experience         Calculated F   Significance*
Client pressure
  3 years          2.23           .1384
  5 years          .67            .4155
  8 years          .61            .4375
  15 years         4.24           .0421*
SEC pressure
  3 years          6.13           .0150*
  5 years          4.04           .0473*
  8 years          .15            .7034
  15 years         2.56           .1127

*Significant at .05.
While SEC pressure is significant, less experienced participants appear to drive that result. SEC pressure is significant at 3 years of experience (p = .015) and 5 years of experience (p = .0473). These results indicate SEC pressure influences the decisions of the less experienced participants more and that they are less willing to allow the client to report earnings aggressively when SEC pressure exists. The significance of the initial estimate covariate reveals that initial differences that resulted from the type of standard remain significant even after the exertion of the two types of pressure. While pressure from the SEC affected the less experienced auditors more, and pressure from the client affected the more experienced auditors more, the impact of the pressure affected the auditors equally in both conditions. In other words, the type of standard did not preclude the auditors from reacting to the various types of pressure, but differences due to the type of standard remained significant.
DISCUSSION AND CONCLUSION

This study provides substantial insights into the effect of principles- vs. rules-based standards on auditor decision-making in the face of client attempts to manage earnings. This has implications for the SEC as it continues to review the current financial reporting model and formulates recommendations for future accounting standards. Additionally, the results address concerns raised by the Treasury Advisory Committee on the
Auditing Profession in its consideration of how to maintain a ‘‘vibrant auditing profession [that forms] the backbone of a marketplace that investors can trust’’ (US Treasury, 2007a). The results provide evidence that: (1) auditors are more willing to allow clients to manage earnings when the standard is rules-based; and (2) pressure from the client and/or SEC impacts auditors’ willingness to agree with the client’s planned earnings management behavior. The impact of this pressure is significant under both a principles- and rules-based standard. Still, even in the face of pressure, the difference between the allowed earnings management under principles- vs. rules-based standards is persistent and significant. An analysis of experience effects on each of the two pressure conditions indicates that auditors with different experience levels react to pressure differently. Client pressure affected the more experienced auditors’ decisions as experienced auditors reduced the level of earnings management activity. This finding sheds additional light on earlier findings of differentials between less experienced and more experienced auditors (Caplan & Kirshenheiter, 2004; Lampe & Sutton, 1994; Sutton, 1993; Sutton & Lampe, 1991), demonstrating that experienced auditors take a stronger position against client pressure. Future studies should examine this observed behavior more specifically. Pressure from the SEC impacted the decisions of auditors in the expected direction – reducing the amount of earnings management. However, further examination of experience effects indicates that the shift under SEC pressure was driven by the less experienced auditors; the more experienced auditors exhibited no adjustments to their acceptance levels. This is consistent with prior research findings indicating that less experienced auditors generally react to pressure. 
More importantly, the results from this study provide additional evidence that regulatory pressure may reduce earnings management behavior. The results suggest that such regulatory letters may be an effective countermeasure to client pressure among the more susceptible, less experienced auditors.

Three main policy implications exist. As suspected by the international community, rules-based standards may allow firms to aggressively report earnings more easily, as seen with the financial reporting practices of the companies perpetrating the major frauds earlier this decade. Many of these companies were structuring transactions to meet the definition of precise standards and auditors were signing off on the client’s preferred reporting position and, in some cases, assisting the client in structuring the transactions to meet the precise rules or focusing too narrowly on compliance. Principles-based standards allow the decision-makers to stand back, think, and follow
through with the best method to report the economics of the transaction. This finding is especially important as U.S. policy makers try to reestablish credibility in the financial reporting system and the SEC, FASB, and IASB transition toward principles-based standards.

The second policy implication relates to the SEC’s notification of potential investigation. While a notification such as this may not significantly affect more experienced auditors, it may affect less experienced auditors who perform the hands-on audit tasks such as obtaining the client’s justification for various choices. If less experienced auditors accept client justifications without question, more experienced auditors may rely on less experienced auditors’ decisions and not question management’s motives. Since the less experienced auditors actually collect the audit evidence, how that evidence is presented to the more experienced auditor can influence the final decision on acceptance of client earnings management activities. This is important to the SEC as it weighs alternatives for pressuring auditees and auditors to enhance the quality and reliability of the financial reporting system.

A final policy implication relates to another interesting finding. Auditors were willing to permit a material misapplication of GAAP with a rules-based standard; when the standard was principles-based, the auditors were willing to agree to an immaterial misapplication of GAAP. This is consistent with the findings of Libby and Kinney (2000) that audit managers expect their client not to record immaterial audit differences when such differences cause earnings to be reported below the consensus forecast. These results again confirm that the concerns of the SEC regarding misapplications of GAAP are indeed well founded. Nonetheless, both experienced and inexperienced auditors under the principles-based conditions actually advocated reduction of earnings management to a level lower than the materiality threshold.
NOTES

1. Six cases were administered in paper format due to a problem accessing files through the respondents’ computers. Administration of the experiment was closely supervised by one of the authors – as participants finished one part of the case, the documents were placed in a sealed envelope before moving forward. No significant differences were observed between the responses in the paper format vs. the computerized format.
2. SFAS No. 144, Accounting for the Impairment or Disposal of Long-Lived Assets, superseded SFAS No. 121. SFAS No. 144 retained the requirements of SFAS No. 121 to recognize and measure impairment losses but resolved some implementation issues. The changes did not affect the variables or design of this study.
3. The instrument was initially pretested with eight auditors in order to determine realism and estimate the amount of time needed for completion. Based on their feedback, revisions were made to include materiality information and to increase uncertainty regarding the impairment. The revised case was then pretested with eight more auditors. No significant changes were made as a result of this second pretest. After the second pretest was completed, the controller of a major Midwest corporation reviewed the instrument for realism and reasonableness. The controller indicated the scenario explanations were realistic.
4. The responses for the cases administered in person and through the contact person were compared and no significant differences were observed.
5. Levels of experience greater than 20 years result in predictions of negative expense, which is theoretically incorrect because the amount is bounded at zero. Ninety-seven percent of all respondents fell within the 3–20 year range.
ACKNOWLEDGMENTS

We appreciate the helpful comments that we received from participants in workshops at the University of Melbourne, University of Connecticut, University of Central Florida, American Accounting Association – Accounting, Behavior and Organizations Mid-Year Meeting, American Accounting Association Midwest Regional Meeting, and Annual Congress of the European Accounting Association. We especially appreciate comments from Donna Bobek, Joseph Canada, Clark Hampton, Amy Hageman, Ana Elon, Jillian Phillips, Karen Teitel, and Mike Willenborg. We would also like to acknowledge feedback of James C. Lampe and Linda Nichols in the design of this study and Ron Bremer for feedback on and assistance with the statistical analysis.
REFERENCES

AAA Financial Accounting Standards Committee. (2003). Commentary: Evaluating concepts-based vs. rules-based approaches to standard setting. Accounting Horizons, 17(1), 73–89.
Anderson, U., & Koonce, L. (1995). Explanation as a method for evaluating client-suggested causes in analytical procedures. Auditing: A Journal of Practice and Theory, 14, 124–132.
Anderson, U. L., Koonce, L., & Marchant, G. (1994). The effects of source-competence information and its timing on auditors’ performance of analytical procedures. Auditing: A Journal of Practice and Theory, 13(1), 137–148.
Barton, J., & Simko, P. J. (2002). The balance sheet as an earnings management constraint. The Accounting Review, 77(Suppl.), 1–27.
Beeler, J., & Hunton, J. (2002). Contingent economic rents: Insidious threats to audit independence. Advances in Accounting Behavioral Research, 5, 21–50.
Botosan, C. A., Koonce, L., Ryan, S. G., Stone, M. S., & Wahlen, J. M. (2005). Accounting for liabilities: Conceptual issues, standard setting, and evidence from academic research. Accounting Horizons, 19(3), 159–187.
Caplan, D., & Kirshenheiter, M. (2004). A model of auditing under bright-line accounting standards. Journal of Accounting, Auditing, and Finance, 19(4), 523–559.
Clor-Proell, S., & Nelson, M. W. (2007). Accounting standards, implementation guidance, and example-based reasoning. Journal of Accounting Research, 45(4), 699–730.
Covaleski, M. A., Dirsmith, M. W., Heian, J. B., & Samuel, S. (1998). The calculated and the avowed: Techniques of discipline and struggles over identity in Big Six accounting firms. Administrative Science Quarterly, 43, 293–327.
D’Andrea, F. (2003). Grant Thornton’s CEO urges principles-based accounting. Accounting Education.com, October 3. Available at http://www.accountingeducation.com/news/news4475.html
Davis, L., & Simon, D. (1992). The impact of SEC disciplinary actions on audit fees. Auditing: A Journal of Practice & Theory (Spring), 58–68.
DeAngelo, L. E. (1981). Auditor independence, ‘low balling’, and disclosure regulation. Journal of Accounting and Economics (August), 113–127.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1996). Causes and consequences of earnings manipulation: An analysis of firms subject to enforcement actions by the SEC. Contemporary Accounting Research (Spring), 1–36.
Dopuch, N., & Simunic, D. (1982). Competition in auditing research: An assessment. Fourth Symposium on Auditing Research, University of Illinois (pp. 403–450).
Doran, J. (2009). US rejects global finance controls. The Observer, March 8.
Emby, C., & Etherington, L. D. (1996). Performance evaluation of auditors: Role perceptions of superior and subordinates. Auditing: A Journal of Practice & Theory, 15(2), 99–109.
Farmer, T. A., Rittenberg, L. E., & Trompeter, G. M. (1987). An investigation of the impact of economic and organizational factors on auditor independence. Auditing: A Journal of Practice & Theory, 7(1), 1–14.
Financial Accounting Standards Board (FASB). (1995). Accounting for the impairment of Long-lived assets and for Long-lived assets to be disposed of. Statement of financial accounting standards no. 121. Norwalk, CT: FASB.
Financial Accounting Standards Board (FASB). (2002). Proposal: Principles-based approach to U.S. standard setting. File reference no. 1125-001. Norwalk, CT: FASB.
Frankel, R. M., Johnson, M. F., & Nelson, K. K. (2002). The relation between auditors’ fees for nonaudit services and earnings management. The Accounting Review, 77(Suppl.), 71–105.
Gibbins, M., Salterio, S., & Webb, A. (2001). Evidence about auditor–client management negotiation concerning client’s financial reporting. Journal of Accounting Research, 39(December), 535–563.
Goldman, A., & Barlev, B. (1974). The auditor–firm conflict of interests: Its implications for independence. The Accounting Review (October), 707–718.
Hackenbrack, K., & Nelson, M. W. (1996). Auditors’ incentives and their application of financial accounting standards. The Accounting Review, 71(1), 43–59.
Healy, P., & Wahlen, J. M. (1999). A review of the earnings management literature and its implications on standard setting. Accounting Horizons, 13(4), 365–383.
Hirst, D. E. (1994a). Auditors’ sensitivity to source reliability. Journal of Accounting Research (Spring), 113–126.
Hirst, D. E. (1994b). Auditors’ sensitivity to earnings management. Contemporary Accounting Research, 11(1), 405–422.
Houston, R. W. (1999). The effects of fee pressure and client risk on audit seniors’ time budget decisions. Auditing: A Journal of Practice & Theory, 18(Fall), 70–86.
Iyer, V. M., & Rama, D. V. (2004). Clients’ expectations on audit judgments: A note. Behavioral Research in Accounting, 16, 63–74.
Johnson, P., Jamal, K., & Berryman, G. (1991). The effects of framing on auditor decisions. Organizational Behavior and Human Decision Processes, 50, 75–105.
Johnstone, K. M. (2000). Client-acceptance decisions: Simultaneous effects of client business risk, audit risk, auditor business risk, and risk adaptation. Auditing: A Journal of Practice & Theory, 19, 1–25.
King, R. R. (2002). An experimental investigation of self-serving biases in an auditing trust game: The effect of group affiliation. The Accounting Review, 77(2), 265–284.
Koonce, L. (1992). Explanations and counter-explanations during audit analytical review. The Accounting Review (67), 59–76.
Kultgen, J. (1988). Ethics and professionalism. Philadelphia, PA: University of Pennsylvania Press.
Lampe, J. C., & Sutton, S. G. (1994). Evaluating the work of internal audit: A comparison of standards and empirical evidence. Accounting and Business Research (Autumn), 335–348.
Levitt, A. (1998). The numbers game. Speech at NYU Center for Law and Business, September 28. Available at http://www.rutgers.edu/Accounting/raw/aaa/newsarc/pr101898.htm
Levitt, A. (2000a). Renewing the covenant with investors. Remarks delivered at the NYU Center for Law and Business, May 10. Available at http://www.sec.gov/news/speech/spch370.htm
Levitt, A. (2000b). Speech at Conference on Rise and Effectiveness of New Corporate Governance Standards, Federal Reserve Bank of New York, NY, December 12. Available at http://www.sec.gov/news/speech/spch449.htm
Libby, R., & Kinney, W. R. (2000). Does mandated audit communication reduce opportunistic corrections to manage earnings to forecasts? The Accounting Review, 75(4), 383–404.
Lochner, P. (1993). Accountants’ legal liability: A crisis that must be addressed. Accounting Horizons, 7, 92–96.
Lord, A. T. (1992). Pressure: A methodological consideration for behavioral research in auditing. Auditing: A Journal of Practice & Theory, 11(2), 89–108.
Marquardt, C., & Weidman, C. (2005). Earnings management through transaction structuring: Contingent convertible debt and diluted earnings per share. Journal of Accounting Research, 43(2), 205–243.
Monger, R. F. (1981). Relational characteristics and resolution modes as determinants of constructive and destructive conflict perceptions between audit firms and their clients. Ph.D. dissertation, University of Houston.
Moreland, K. A. (1995). Criticisms of auditors and the association between earnings and returns of client firms. Auditing: A Journal of Practice & Theory, 14(1), 94–104.
Moreno, K., & Bhattacharjee, S. (2003). The impact of pressure from potential client business opportunities on the judgments of auditors across professional ranks. Auditing: A Journal of Practice & Theory, 22(1), 13–28.
Nelson, M. W. (2003). Behavioral evidence on the effects of principles- and rules-based standards. Accounting Horizons, 17(1), 91–104.
Nelson, M. W., Elliott, J. A., & Tarpley, R. L. (2002). Evidence from auditors about managers’ and auditors’ earnings management decisions. The Accounting Review, 77(Suppl.), 175–202.
Nichols, D., & Price, K. (1976). The auditor-firm conflict: An analysis using concepts of exchange theory. The Accounting Review (April), 335–346.
Osterland, A. (2005). A man of principles. Institutional Investor (January), 1–9.
Piper, A. (2005). A matter of principles. The Internal Auditor, 62(5), 62–68.
Pozen, R. (2007). Discussion Paper for Consideration by the SEC Advisory Committee on Improvements to Financial Reporting, Securities and Exchange Commission. Available at http://www.sec.gov/about/offices/oca/acifr/acifr_discussion.htm. Retrieved on July 31.
Public Oversight Board. (1999). 1999 Annual Report, Stamford, CT: POB.
Rich, J. S., Solomon, I., & Trotman, K. T. (1997). The audit review process: A characterization from the persuasion perspective. Accounting, Organizations and Society, 22(5), 481–505.
Riedl, E. J. (2004). An examination of long-lived asset impairments. The Accounting Review, 79(3), 823–852.
Schuetze, W. (1994). A mountain or a molehill? Remarks by Walter Schuetze to American Institute of Certified Public Accountants: Twenty-First Annual National Conference on SEC Developments. Reprinted in Accounting Horizons, 8, 69–75.
Securities and Exchange Commission. (2007). SEC establishes advisory committee to make U.S. Financial Reporting System more user-friendly for investors, United States Securities and Exchange Commission, Washington, DC, June 27. Available at http://www.sec.gov/news/press/2007/2007-123.htm
Securities and Exchange Commission. (2008a). Final report of the Advisory Committee on Improvements to Financial Reporting to the United States Securities and Exchange Commission, United States Securities and Exchange Commission, Washington, DC, August 1. Available at http://www.sec.gov/about/offices/oca/acifr/acifr-finalreport.pdf
Securities and Exchange Commission. (2008b). Roadmap for the potential use of financial statements prepared in accordance with International Financial Reporting Standards by U.S. Issuers, United States Securities and Exchange Commission, Washington, DC, November 14. Available at http://www.sec.gov/rules/proposed/2008/33-8982.pdf
Shah, A. (1996). Creative compliance in financial reporting. Accounting, Organizations and Society, 21, 23–41.
Shortridge, R. T., & Myring, M. (2004). Defining principles-based accounting standards. The CPA Journal, 74(8), 34–38.
St. Pierre, K., & Anderson, J. (1984). An analysis of factors associated with lawsuits against public accountants. The Accounting Review, 59, 242–263.
Sutton, S. G. (1993). Toward an understanding of the factors affecting the quality of the audit process. Decision Sciences (January/February), 88–105.
Sutton, S. G., & Lampe, J. C. (1991). A framework for evaluating process quality for audit engagements. Accounting and Business Research (Summer), 275–288.
Trompeter, G. (1994). The effect of partner compensation schemes and generally accepted accounting principles on audit partner judgment. Auditing: A Journal of Practice & Theory, 13(Fall), 56–68.
United States Congress, House of Representatives. (2002). Sarbanes-Oxley Act of 2002. 107th Congress, 2nd Session, Washington: GPO, 2002. Available at http://news.findlaw.com/hdocs/docs/gwbush/sarbanesoxley072302.pdf
United States Treasury. (2007a). Paulson announces first stage of Capital Markets Action Plan, US Treasury. Available at http://www.treas.gov/press/releases/hp408.htm. Retrieved on May 17.
United States Treasury. (2007b). Paulson announces auditing committee members to make recommendations for a more sustainable, transparent industry, US Treasury. Available at http://www.treas.gov/press/releases/hp585.htm. Retrieved on October 2.
Weil, R. L. (2002). Fundamental causes of the accounting debacle at Enron: Show me where it says I can’t. Summary of Testimony for Presentation February 6, 2002, House Committee on Energy and Commerce.
A COMPARISON OF ELICITATION METHODS FOR PROBABILISTIC MULTIPLE HYPOTHESIS REVISION

Craig Emby

Advances in Accounting Behavioral Research, Volume 12, 85–108. Copyright © 2009 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012007

ABSTRACT

The evaluation of competing hypotheses is an essential aspect of the audit process. The method of evaluation and re-evaluation may have implications for both efficiency and effectiveness. This paper presents the results of a field experiment using a case study set in the context of a fraud investigation in which practicing auditors were required to engage in multiple hypothesis probability estimation and revision regarding the perpetrator of the fraud. The experiment examined the effect of two different methods of facilitating multiple hypothesis probability estimation and revision consistent with the completeness and complementarity norms of probability theory as it applies to the independence versus dependence of competing hypotheses and with the prescriptions of Bayes’ Theorem. The first method was to have participants use linear probability elicitation scales and receive prior tutoring in probability theory emphasizing the axioms of completeness and complementarity. The second method was to provide a graphical decision aid, without prior tutoring, to aid the participants in expressing their responses. A third condition, in which participants used linear probability elicitation scales but received no tutoring in probability theory, provided a benchmark against which to assess the effects of the two treatments.
Participants receiving prior tutoring in probability theory and using linear probability elicitation scales complied in their estimations and revisions with the probability axioms of completeness and complementarity. However, they engaged in frequent violations of the normative probability model and of Bayes’ Theorem. They did not distribute changes in the probability of the target hypothesis to the nontarget hypotheses, and they engaged in “eliminations and resuscitations” whereby they eliminated a suspect by assigning a zero probability to that suspect at an intermediate iteration and resuscitated that suspect by reassigning him or her a positive probability at a later iteration. The participants using the graphical decision aids, by construction, did not violate the probability axioms of completeness and complementarity. However, with no imposed constraints, the patterns of their revisions were different. When they revised the probability of the target hypothesis, they revised the probabilities of the nontarget hypotheses. They did not engage in eliminations and resuscitations. These patterns are more consistent with the norms of probability theory and with Bayes’ Theorem. Possible explanations of this phenomenon are proposed and discussed, including implications for audit practice and future research.
INTRODUCTION

This paper presents the results of a field experiment using a case study set in the context of a fraud investigation that required auditors to perform multiple hypothesis probability estimation and revision. The experiment examined the effect of two different methods of facilitating auditors’ multiple hypothesis probability estimation and revision consistent with the completeness and complementarity axioms of probability and with the norms of Bayes’ Theorem.

The evaluation of competing hypotheses is an integral aspect of the audit process. Initial estimation of the probabilities of competing hypotheses is followed by the revision of the probabilities of alternative hypotheses or explanations for an identified event as new evidence becomes available. Analytical review, which often requires an auditor to identify the source of an unidentified fluctuation in a client’s account balance (Asare & Wright, 1997a), is a commonly encountered example of multiple hypothesis evaluation. The setting of this case study, a fraud investigation where the auditor is attempting to identify the perpetrator of the fraud, is a more dramatic example of the same application.
The normative model of probabilistic revision is not the only model for hypothesis estimation and revision. Other models exist that have application in such circumstances. In particular, a highly developed alternative model is the belief-function model, which is less restrictive in its covenants. For instance, it does not necessarily require the beliefs of the alternatives to sum to one and may be applied in circumstances where the evidence and/or the alternatives are not necessarily independent. The belief-function model has its genesis in Shafer (1976) and has been extended and further developed in Dutta and Srivastava (1993), Shafer and Srivastava (1990), Srivastava, Wright, and Mock (2002), and Srivastava and Shafer (1992). Comprehensive discussion and examples of the application of the belief-function approach to audit hypothesis evaluation in complex settings can be found in Shafer and Srivastava (1990), Srivastava and Shafer (1992) and Srivastava et al. (2002). However, the results of this study suggest that in cases where the structure of the alternatives in the revision task satisfies the constraints of exhaustiveness (the set of alternatives can be completely specified) and mutual exclusivity (one and only one of the alternatives is true), and where each of the items of evidence relates to one and only one of the alternatives, the normative probabilistic model may have descriptive validity.

As Asare and Wright (1997b) point out, whether auditors use a complementarity-based strategy or independence strategy for multiple hypothesis revision (MHR) may have implications for audit efficiency and effectiveness. Using a complementarity strategy implies that the auditor recognizes that direct evidence about one hypothesis is also indirect evidence about competing hypotheses.
If the auditor interprets a particular item of evidence as increasing the probability of a particular hypothesis, and reduces the probability of competing hypotheses accordingly, he or she may feel it appropriate to eliminate certain other hypotheses based on their resulting reduced probabilities. This is normatively appropriate and maximizes judgment efficiency in terms of both information requirements and timeliness. Under the independence, or “one-hypothesis” approach, in a similar scenario the auditor would not be willing to revise the probability of the competing hypotheses downward unless he or she obtained direct evidence about those hypotheses. This implies a much more exhaustive (i.e., time consuming and expensive) evidence search process. Waller (1994) suggests that the more exhaustive information search under the independence approach could be a trade-off against the cognitively more complex (and therefore costly) analysis required by the complementarity-based approach. Reluctance to dismiss or eliminate a hypothesis until direct evidence about
that hypothesis indicates that it is appropriate to do so implies a trade-off of efficiency in favor of perceived effectiveness. Patterns of participants’ hypothesis revision in prior research have been compared to the norms of axiomatic probability and of Bayes’ Theorem. The observed noncompliance with the axioms and the theorem has been interpreted as strongly supportive of the one-hypothesis approach, an interpretation supported by motivation-related arguments such as auditors’ bias towards conservatism. However, following the suggestion of Waller (1994), the patterns observed in previous research may have been at least in part a consequence of the mode the participants used to express their estimations and revisions and the arithmetical requirements imposed thereby. One of the response modes used by the auditors in this study to express their estimations and revisions removes the need to incur the cognitive processing cost necessary to ensure arithmetical compliance with the probability axioms and Bayes’ Theorem.

The next section reviews some of the prior literature on MHR, leading to the statement of the hypotheses and research questions. The section following describes the experiment, including the experimental design and details of the administration. Following that are the results; the discussion section provides some observations and tentative conclusions, discusses the limitations of this study, and suggests avenues for future research.
BACKGROUND AND PRIOR RESEARCH

The normative model for MHR is the Probabilistic Judgment Paradigm (PJP). Two of the main axioms of the PJP are completeness ($\sum_i P(A_i) = 1$) and complementarity ($P(A_i) + \sum P(\bar{A}_i) = 1$). An important consequence of the axioms is that changes in the probability of one alternative must be accompanied by offsetting changes in the other direction, in sum, to the probability(ies) of the other alternative(s) to maintain completeness and complementarity. Bayes’ Theorem implies that when a particular alternative is eliminated from consideration (i.e., assigned a zero probability) it can never be reinstated.

Numerous previous studies in cognitive psychology (e.g., Robinson & Hastie, 1985; Teigen, 1974a, 1974b, 1983; Van Wallandael, 1989) have examined the question of how individuals evaluate competing hypotheses regarding the cause of an event. The results have appeared to show that individuals consistently violate the axioms of probability, particularly complementarity, regarding the reallocation of probabilities to nontarget
hypotheses when a change is made to the probability of a target hypothesis, as well as the implications of Bayes’ Theorem regarding the elimination of one or more nontarget hypotheses. Studies in auditing (e.g., Asare & Wright, 1997a, 1997b; Bhattacharjee, Kida, & Hanno, 1999; Heiman, 1990; Srivastava et al., 2002) have essentially shown the same results. The overwhelming violation of the complementarity axiom was supra-additivity. If the evaluator increased the probability of a particular hypothesis, there was no corresponding decrease in the probabilities of competing hypotheses. This apparent violation of normative patterns supports Waller’s (1994) cognitive complexity argument.

The overall focus of this study is to examine whether there are differences in the auditors’ patterns of probability revision resulting from the use of different response modes and how, or if, the participant auditors’ patterns of probabilistic hypothesis revision are consistent with the axioms and related theorems of probability revision. If in fact the independence approach to evaluating hypotheses is a response to the high cognitive costs of constantly updating the probabilities of all hypotheses under consideration, providing a way for auditors to perform the updating without the cognitive strain (Gavanski & Hui, 1992) may reveal a more normative approach in their thought patterns.

One approach used in past research to promote adherence to completeness and complementarity is simply to ensure that the judges are familiar with the “rules” of probabilistic revision and to instruct them to follow those rules. The results of Robinson and Hastie (1985), who used linear probability elicitation scales combined with just such an approach, indicated that it works well. With almost no exceptions, the participants in their study adhered to complementarity at each iteration of the hypothesis revision task.
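As a rough illustration of the offsetting-change property and of supra-additivity described above, the following Python sketch applies Bayes’ Theorem to four mutually exclusive and exhaustive hypotheses and contrasts the result with an independence-style revision. The suspect labels, priors, and likelihoods are hypothetical values chosen for illustration only; they are not taken from this study or from the studies cited.

```python
from typing import Dict


def bayes_update(priors: Dict[str, float],
                 likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Posterior P(H_i | D) for mutually exclusive, exhaustive hypotheses."""
    # P(D) via the law of total probability.
    p_data = sum(priors[h] * likelihoods[h] for h in priors)
    if p_data == 0:
        raise ValueError("Evidence has zero probability under every hypothesis.")
    return {h: priors[h] * likelihoods[h] / p_data for h in priors}


def is_complete(probs: Dict[str, float], tol: float = 1e-9) -> bool:
    """Completeness axiom: the probabilities sum to one."""
    return abs(sum(probs.values()) - 1.0) <= tol


# Hypothetical example: four suspects with equal priors (illustrative only).
priors = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
# Assumed likelihood of the new evidence under each hypothesis.
likelihoods = {"A": 0.6, "B": 0.2, "C": 0.2, "D": 0.2}

# Normative revision: raising A lowers B, C, and D so the total stays at one.
posteriors = bayes_update(priors, likelihoods)

# Independence-style revision: raise the target, leave the others untouched.
# The total now exceeds one (supra-additivity).
independent = dict(priors, A=posteriors["A"])

print(posteriors)
print(is_complete(posteriors), is_complete(independent))
```

In this sketch the evidence bears directly on suspect A only, yet the normative update lowers the probabilities of the other three suspects, which is the sense in which direct evidence about one hypothesis is indirect evidence about its competitors.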
However, Robinson and Hastie observed other non-Bayesian revision patterns in their participants who used linear probability elicitation scales, received tutoring, and did comply with complementarity. Specifically, they observed what they termed “eliminations and resuscitations.” Participants eliminated a suspect by assigning them a zero probability at iteration “x,” and reassigned them a nonzero probability at iteration “x + 1” (or “x + 2” . . . ). Such a pattern is a contradiction of Bayes’ Theorem – once the probability of an alternative is reduced to zero, it must be permanent. Consideration of the general formula that describes Bayesian probability, $P(H_i \mid D) = P(D \mid H_i)\,P(H_i)/P(D)$, shows that once $P(H_i) = 0$, the numerator, and hence the value of the entire expression, will subsequently always be equal to zero. The results of Robinson and Hastie (1985) and Asare and Wright (1995, 1997b), suggest strongly that the tutored auditor-participants in the linear
probability elicitation conditions will engage in eliminations and resuscitations. This is consistent with Wright (1974) who suggested that individuals operating under a higher cognitive processing load might simplify their task by attending to a reduced set of data. The replication of the linear probability elicitation tutored condition provides a set of responses that can be contrasted with the responses of the participants in the other experimental condition.

There has been research in auditing on the effect of alternative information presentation modes on auditors’ judgments (Anderson & Kaplan, 1992; Anderson & Reckers, 1992; see also Moriarity, 1979, who studied the phenomenon in the context of financial accounting, and Cardinaels, 2008 who studied the effect in the context of cost accounting). However, the effect of alternative response modes on auditors’ judgments has not been studied. The alternative response mode used in this study, the “pie-chart” or circle graph, is intended to address that gap.

The circle graph was chosen for theoretical and practical reasons. For example, Torgerson (1958) suggested that the difficulty or ease of use of different forms of data representation could be visualized in terms of scale complexity, which he described as a function of the number of cognitive scale elements (e.g., units, origin, distance) that must be processed. Gnanadesikan (1980) proposed criteria for judging the relative merits of various alternative graphical modes, such as descriptive capacity, potential for internal comparisons, and aid in focusing attention (see also Ives, 1982; Vessey, 1991). Bowman (1968) suggested that the benefits of the circle graph in particular stemmed from its ability to show “ . . . separate segments in relation to a visually unified whole” (p. 173); and, Chernoff (1978) suggested that decision aids that have a strong visual component are beneficial in portraying “ . . . a more complete and better-balanced understanding” (p. 4). By removing the need for the participants to engage in arithmetical processing at either the input or output stage of their judgments, the circle-graph response mode provides subjects with a method for making and expressing their revisions which should, according to the above criteria, reduce the level of cognitive complexity. Based on the above arguments, the circle-graph mode may promote a more “holistic” view of the evidence and the suspects. This in turn may facilitate recognition of the interdependence of the subjects and the implications of a change in the probability estimate for one suspect on the probability estimates of the other suspects.

On a more pragmatic note, the circle-graph form of data representation is one that is familiar to the auditors. It seems like a very reasonable assumption that all of the participants would be familiar with its use and interpretation.
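The way a circle graph can spare the judge any arithmetic may be sketched as follows. This is a hypothetical model of such a response mode, not a description of the decision aid used in the experiment: segments are stored as shares of the whole, and the only operation is dragging the boundary between a segment and its clockwise neighbour. The function name `move_boundary` and the example values are assumptions made for illustration.

```python
from typing import List


def move_boundary(segments: List[float], index: int, delta: float) -> List[float]:
    """Shift the boundary between segment `index` and its clockwise neighbour.

    A positive delta enlarges segment `index` at the neighbour's expense;
    a negative delta does the reverse. The total is unchanged by construction.
    """
    neighbour = (index + 1) % len(segments)
    # Clamp the shift so that neither segment drops below zero.
    delta = max(-segments[index], min(delta, segments[neighbour]))
    updated = list(segments)
    updated[index] += delta
    updated[neighbour] -= delta
    return updated


# Hypothetical starting point: four suspects with equal shares of the circle.
start = [0.25, 0.25, 0.25, 0.25]

# Enlarge the first suspect's segment by 10 percentage points.
revised = move_boundary(start, 0, 0.10)

# A request larger than the neighbour's share is clamped to that share.
clamped = move_boundary(start, 0, 0.50)

print(revised, sum(revised))
print(clamped, sum(clamped))
```

Because every adjustment moves probability between two segments of a fixed whole, completeness holds after each revision without the judge computing a sum, which is the sense in which compliance is obtained by construction.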
If the eliminations and resuscitations observed by Robinson and Hastie (1985) are not present in the revision patterns of the participants in this study, such behavior may have represented a way of reducing the cognitive complexity referred to by Waller (1994), rather than evidence of nonnormative processing. The circle-graph response mode does not by design ‘‘force’’ the reallocation of probability to more than one other suspect. Generally, to make one segment larger (smaller), the judge needs to take (give) additional probability from (to) only one other segment. It is true that mechanically, in revising the probability of the target hypothesis upwards, the judge may have to take probability from more than one alternative to have what they perceive as a sufficient increase. This sort of circumstance would give the appearance of more reallocations but would, in reality, be an artifact of the experimental design. However, this is mitigated in this case by the fact that the evidence was deliberately relevant but not overwhelming, as shown by the manipulation check (discussed in a later section), and that the resulting revisions to the probability of the target hypothesis were not so great as to produce this effect.1
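To make the elimination and resuscitation pattern discussed above concrete, the following minimal sketch counts resuscitations in one suspect's sequence of probability estimates. The function name and the example sequences are hypothetical illustrations, not part of the study's materials; the sketch assumes that an elimination means an estimate of exactly zero and that a resuscitation is any later return to a nonzero value.

```python
def count_resuscitations(series):
    """Count how often a suspect is reinstated after being eliminated.

    series: probability estimates for ONE suspect across successive assessments.
    Assumption: an elimination is an estimate of exactly zero, and a
    resuscitation is a later return to a nonzero estimate.
    """
    resuscitations = 0
    eliminated = False
    for p in series:
        if p == 0:
            eliminated = True          # suspect currently has zero probability
        elif eliminated:
            resuscitations += 1        # nonzero estimate after a zero
            eliminated = False
    return resuscitations


# Hypothetical sequences (initial estimate followed by revisions).
print(count_resuscitations([0.20, 0.00, 0.00, 0.15, 0.00, 0.10]))
print(count_resuscitations([0.20, 0.10, 0.00, 0.00]))
```

Under Bayes' Theorem a hypothesis with zero prior probability keeps zero posterior probability, so any nonzero count from such a function would flag a revision pattern of the kind Robinson and Hastie (1985) describe.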
HYPOTHESES AND RESEARCH QUESTIONS

The formulation of the specific hypotheses and research questions was guided by previous research findings in probabilistic hypothesis revision and by expectations based on the literature regarding the efficacy of graphical representation. The hypotheses are stated in the direction of expectation.

H1. The participants using the linear probability elicitation response mode, but not receiving tutoring, will not comply with completeness and complementarity in their probability estimations and re-estimations.

The evidence of prior studies, both in auditing (e.g., Asare & Wright, 1995, 1997b) and in cognitive psychology (e.g., Robinson & Hastie, 1985), suggests strongly that these subjects will not comply with the axioms of the PJP. However, it is necessary to investigate/confirm H1 to establish a benchmark for comparison with the other experimental treatments.

H2. The participants using the linear probability elicitation response mode, and receiving tutoring and instructions, will comply with completeness and complementarity in their probability estimations and re-estimations.
This is a replication, in an auditing setting, of the results reported in cognitive psychology by Teigen (1983) and Robinson and Hastie (1985). They found that the participants who received prior tutoring in the axioms of the PJP did comply with completeness and complementarity in the initial and subsequent probability estimations and re-estimations. The evidence from these prior studies shows that tutoring is a strong manipulation and very effective. There is no reason to expect otherwise here. The reason for supplementing the tutoring with instructions is to ensure that the participants comply to allow examination of their responses as a result of that compliance. This leads directly to the statement of Research Question 1.

RQ1. Will different response modes be associated with different patterns of probability revision? In particular, will the participants in any of the groups (1) distribute changes in probability estimates of the target hypothesis to alternative hypotheses or (2) display eliminations and resuscitations?²

Examination of the responses of the participants may also disclose other patterns that are not anticipated. If so, such observations will be discussed in an additional analysis section.

RQ2. Will the participants using the different response modes show any differences in confidence in their final overall judgments of the probabilities of guilt of the suspects?

RQ3. Will the participants using the different response modes perceive any differences in the difficulty of using the different modes to express their probabilities of guilt of the different suspects?

One of the potential byproducts of a response method that stresses the dependence of alternative hypotheses in probability revisions is that it may result in a more complete consideration of the evidence as it pertains to the overall set of alternatives (Bowman, 1968; Chernoff, 1978). The participants were asked at the end of the experiment to indicate how confident they were in their final judgment. As a measure of the perceived cognitive complexity of the different response modes, participants were asked how difficult they found the response mode to use.
THE EXPERIMENT

Experimental Setting

The context of the study was a fraud investigation with five suspects. A fraud investigation setting was chosen for two reasons. For internal validity, an unambiguous setting where the alternatives were clearly defined was important. The case states that the perpetrator must be one of the five suspects and that the perpetrator acted alone. Such circumstances create a set of competing exhaustive and mutually exclusive hypotheses. Imposing this constraint limits the generalizability of the results, but this setting is one where the set of suspects could be exhaustive and mutually exclusive, two conditions that eliminate the possibilities of interdependencies amongst the alternatives and facilitate the revision of hypotheses according to the axioms of the PJP and Bayes’ Theorem.³

Participants

The participants were 105 practicing auditors from eight Chartered Accounting firms in a major Canadian city. The purpose of the experiment was explained to the participants as “to observe how individuals make judgments based on incomplete information.” The experiment was conducted in the offices of the participants’ firms. Every administration of the experiment was done by the author. At the end of each experimental session, the participants were asked to furnish certain demographic background information such as length of experience, position in their firm, and educational background. The participants were assured of the anonymity of their responses. Table 1 presents descriptive statistics on the participants.
Table 1. Respondent Demographics.

Condition: Linear-Scale Tutored
Panel A – by position in the firm: Staff 9; Senior/Supervisor 16; Manager/Partner 10; Total 35
Panel B – by level of experience (in years): ≤2, 10; >2 and <6, 17; ≥6, 8; Total 35

Conditions: Linear-Scale Nontutored; Circle Graph
8 15 12 35
8 15 12 35
10 13 12 35
13 14 8 35

Experimental Design

The experimental design was a between-participants 3 × 1 ANOVA. One cell represented the benchmark – it provided the participants with a set of linear probability elicitation scales from 0% to 100% on which to record their probabilities of guilt for each of the five suspects. The observed results were expected to be consistent with those of previous studies (Asare &
Wright, 1995, 1997a; Robinson & Hastie, 1985; Teigen, 1983), all of which showed that the participants responded as if they were treating the competing hypotheses as independent. The other two cells represented two different approaches to promoting recognition of the dependence of the competing hypotheses (e.g., compliance with the axioms of probability and with Bayes’ Theorem regarding hypothesis revision). The first treatment was to provide explicit tutoring in the PJP focusing on the axioms of completeness and complementarity. The tutoring was reinforced with reminders to the participants in this condition to follow complementarity in their probability revisions. Participants in this condition expressed their probability revisions using the same set of linear probability elicitation scales as the participants in the baseline condition. The second treatment was to provide a response mode that facilitated the participants’ consistency with the requirements of the PJP (i.e., completeness and complementarity) without the necessity for them to engage in any mathematical manipulations. The response mode provided was a circle graph on which the participants indicated, by the size of the segments they drew, the probability of guilt of each of the five suspects. Graphical representation of quantitative data has been used in a number of business-related disciplines (e.g., Jarvenpaa & Dickson, 1988; Remus, 1984, 1987; Schulz & Booth, 1995) where it has been generally found to lead to superior judgments; circle graphs, particularly to those in business, are an extremely familiar form of graphical representation.
The auditors were required to estimate and revise their subjective probabilities of guilt of five suspects in a fraud investigation. There were nine assessments: the initial situation and eight additional pieces of evidence. The evidence related to motive, means, and opportunity, but was not a direct statement of the guilt or innocence of any suspect. The experimental materials were developed and refined in an extensive iterative process in consultation with four practicing senior partners (independent of the experimental participants), who suggested most of the items of additional evidence.
Administration

The participants received prior training appropriate to their experimental condition. All participants were introduced to the response mode they would use in the experiment through an exercise similar to the experiment but on a different topic – a murder mystery adapted from Robinson and Hastie (1985) – that required a set of initial probability estimations and one set of re-estimations. The participants in the benchmark condition used the response scales to be used in the experiment (a set of linear 10-cm scales running from “0.0 – no chance” to “1.0 – sure thing,” marked off in 10 equal intervals). They received no instructions in how the scales “should” be used normatively, to avoid influencing subsequent responses. The participants in the tutored condition used the same scales as described above in the exercise and experiment. The pre-experimental exercise emphasized adherence to the axioms of completeness and complementarity in their revisions. The participants using the circle-graph response mode performed the exercise using graphs (one for the initial estimation and one for the re-estimation) marked on their circumference into 10 equal segments. As in the benchmark condition, they received no instructions in how they should be used, to avoid influencing subsequent responses. Each participant received an experimental package. On the first page of the package was a brief narrative description of the client and the situation – an ongoing audit in which evidence of a material fraud had been discovered. The narrative provided a brief description of what had been discovered to date and brief descriptions of five suspects. The participants read that senior management had requested that a further investigation be undertaken to determine the extent of the fraud and the identity of the perpetrator. They also read that it had been determined for certain that the guilty party must be one of five suspects and that there was no possibility of collusion between the suspects.
Based on the initial evidence, the participants were asked to estimate their subjective likelihoods of guilt for each of the five suspects. Subsequent to this initial estimation, eight additional pieces of evidence were presented to them one at a time, such as might be disclosed by additional investigation. The evidence was ‘‘circumstantial’’ in that it pertained to means, motive, and opportunity. After each piece of evidence, the participants were asked to re-estimate their subjective likelihoods for each of the five suspects. Each additional piece of evidence was direct (confirming or disconfirming), about one and only one of the suspects, and was presented on a separate page, to reflect the manner in which an investigation might turn up different pieces of evidence over time. The diagnosticity of the information was subject to a manipulation check (see footnote 1 and the discussion under Manipulation Check). For participants using the linear probability elicitation scales, the five suspects were listed immediately below the initial narrative (for initial estimation) and again on each page following the additional piece of evidence. The order of the list of suspects was randomized for each presentation. Below each suspect’s name was a rating scale (described above) that the participants used to record their subjective likelihood of guilt for that suspect. The experimental material was identical for the two groups, with one exception. A printed reminder that the sum of the probabilities should be equal to 1.0 was shown at the bottom of each page requesting estimation of probabilities given to the tutored condition.4 For the circle-graph condition, the suspects were listed as for the other two groups. 
Beside the list was a circle graph with the circumference marked into ten equal sections (as in the pre-experimental illustrations) and the center marked with a ‘‘.’’ The participants were instructed to divide the circle graph into segments representing their subjective likelihoods of guilt for each of the suspects and to mark the segments with the name or initials of that suspect. All participants were provided with a pencil and an eraser. Participants were reminded that, although they were not to go back and change a previous set of estimations after they had started the next set, they were free to change the ‘‘current’’ set as many times as they wished, until they were satisfied with it. In addition, the participants in circle-graph condition were verbally reminded that if they felt the probability of a particular suspect to be zero, they were free to exclude that suspect from the circle graph. After the eighth iteration, all the participants read that it was now the end of the day and that the investigation would continue tomorrow. However, senior management was interested in knowing their impressions based on
the evidence so far. The participants were asked to indicate how confident they were about their final overall assessments on a continuous scale anchored by “1 – very unconfident” and “7 – very confident,” and to indicate how difficult they felt the task to be on a continuous scale anchored by “1 – not difficult” and “7 – very difficult.” In order for the analysis of participants’ probability revisions to be meaningful, it was essential that the information provided subsequent to the baseline estimation be perceived by the participants as diagnostic. To test whether this objective was achieved, after the completion of the experiment, the subjects were asked to provide information for a manipulation check. Each of the eight pieces of additional evidence was rated on a continuous scale from “1 – irrelevant” to “7 – very relevant” to indicate the participants’ perception of its importance in relation to the assessment of guilt of the suspect. Mean values for each piece of evidence were computed for each condition and were compared by way of a one-way MANOVA. The evidence was perceived similarly by each group (F = 1.681, p = .268). The average ratings for the eight pieces of evidence ranged from a low of 3.75 to a high of 5.65. The change in probability assessments ranged from 4.95% to 14.19%. The coefficient of correlation between the relevancy ratings and the changes in probability was significant at p < .02 using either r or ρ.
RESULTS

Belief Revision Patterns of Linear Probability Elicitation Scale – Untutored Participants

H1, as stated, was unequivocally supported. The untutored condition using the set of linear probability elicitation response scales departed markedly from completeness and complementarity. The sum of the estimates by the participant with the highest estimates ranged from 310% to 360% over the nine assessments, with an average of 336%. One subject appeared to adhere to completeness and complementarity (allowing a margin for experimental imprecision) with a range from 85% to 105% and a mean of 97%. Two other participants’ responses were subadditive, but the overwhelming majority of responses were supra-additive. The grand mean of the 35 subjects over the nine assessments was 205%. Table 2 shows the results for this condition for each of the nine iterations and Fig. 1 presents the same information graphically.
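The distinction between additive, subadditive, and supra-additive responses drawn above can be sketched as a simple check on the sum of one participant's five estimates. The function name, the example responses, and the 5-point tolerance are assumptions made for illustration; the chapter mentions "a margin for experimental imprecision" but does not specify one.

```python
def classify_additivity(estimates, tolerance=5.0):
    """Classify one set of probability estimates (in %) by their sum.

    tolerance is an assumed margin (in percentage points) around 100%
    within which the set is treated as complying with completeness.
    """
    total = sum(estimates)
    if abs(total - 100.0) <= tolerance:
        return "additive"
    return "supra-additive" if total > 100.0 else "subadditive"


# Hypothetical responses for five suspects (not data from the study).
print(classify_additivity([60, 50, 40, 30, 25]))  # sum is 205
print(classify_additivity([30, 25, 20, 15, 10]))  # sum is 100
print(classify_additivity([20, 15, 10, 10, 5]))   # sum is 60
```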
Table 2. Mean Probability Estimates for Participants in the Linear-Scale Nontutored Group (Values Expressed in %).

Iteration   Initial   #1     #2     #3     #4     #5     #6     #7     #8
Mean        201       208    216    209    215    196    211    195    196
Std. Dev.   81.3      78.8   72.6   73.8   75.3   73.3   76.1   69.1   69.2
Low         90        90     105    95     90     90     90     80     90
High        360       365    350    330    340    330    350    340    340

Note: No single participant was consistently highest or lowest – the 18 responses are from 14 different participants.
Belief Revision Patterns of Linear Probability Elicitation Scale – Tutored Participants

The results unequivocally support H2 as stated – the auditor-participants could be trained to make decisions within the constraints of the axioms of probability. Each of the participants using the linear probability elicitation and receiving prior tutoring in probability theory complied with the axioms of completeness and complementarity in their initial probability estimates as well as the eight subsequent iterations – the sum of their probability estimates for the five suspects was always equal to 100%. With these results as strong confirmation of previous research findings, the research questions focus on the contrasting effects of the different response modes.
Comparison of Belief Revision Patterns amongst the Three Conditions

Examination of the probability revision patterns of the participants in the different cells shows marked differences. The participants who received tutoring in probability theory (and who all complied with complementarity throughout the revisions) frequently engaged in the non-Bayesian practice of eliminations and resuscitations. Specifically, of the 35 participants in the linear-scale tutored condition, 28 (80%) engaged in resuscitations. The number of times these participants engaged in resuscitations ranged from one to seven. In some cases there were multiple eliminations, i.e., a participant eliminated more than one suspect (at the same or different iterations) and subsequently resuscitated them (at the same or different iterations). None of the participants in the circle-graph or linear probability elicitation scale nontutored conditions engaged in any resuscitations. A χ² test yields a value of 76.364, p ≤ 0.001.⁵ Table 3 below summarizes the results by group.

Fig. 1. Profile of Mean Probability Estimates for Linear Non-Tutored Condition at Each Iteration. (Vertical axis: Per Cent; horizontal axis: Items of Evidence, Initial and #1 to #8.)

Table 3. Frequency of Resuscitations by Condition.

Revision Pattern     Circle-Graph   Linear-Scale Nontutored   Linear-Scale Tutored
Resuscitations       0              0                         28
No resuscitations    35             35                        7

Note: χ² = 76.364, p = 0.000.

The earliest iteration at which a suspect was eliminated by a participant in the linear probability elicitation scale tutored condition was iteration number one. That suspect was subsequently reinstated at iteration three. There were numerous instances of eliminations and subsequent resuscitations from
iteration number two onwards. Iteration-by-iteration analysis of the responses of the participants in the linear-scale tutored condition showed that while the resuscitations involved some suspects more than others, all of the suspects experienced some incidence of the phenomenon. The most frequently eliminated and resuscitated suspect was John Sims, chief warehouseman. The first item of evidence about this suspect was disconfirming (it was intended to reduce the participants’ belief in the probability of his guilt). However, the second-most eliminated and resuscitated suspect was Susan Evans, salesperson, and the first item of evidence concerning her was confirming. The more or less randomness of the eliminations and resuscitations supports the interpretation that this was not evidence-driven but was a strategy to reduce the cognitive cost of the hypothesis revision process, particularly in the case of participants who did so multiple times. In each of the other two conditions, there were eliminations of suspects. However, in every case where there was an elimination, it was permanent. There were no resuscitations. In the linear probability elicitation scale nontutored condition, the earliest elimination was at iteration number seven; in the circle-graph condition, it was at iteration number six. In both conditions, there were instances where suspects were reduced to low levels of probability and subsequently re-evaluated to higher levels; but, there seemed to be a qualitative recognition that elimination of a suspect was a final rather than intermediate judgment. Another observed difference in the patterns of revision behavior of the auditors in the different conditions relates to the question of whether or not hypotheses are being evaluated independently or whether there is recognition of the dependence of the set of competing hypotheses. 
While the axioms of probability mandate complementarity in probability revision, the metrics of that revision process are not as clear-cut. One heuristic metric that has some intuitive appeal is as follows: if the new piece of evidence has implications for one and only one of the alternatives (the ‘‘target hypothesis’’), and if as a result the probability of the target hypothesis is increased (decreased), a corresponding amount may be taken from (allocated to) each of the competing hypotheses in proportion to their probabilities immediately prior to the revision of the target hypothesis. Participants were not expected to perform their revisions following such metrics. The calculations involved over eight iterations would be extensive to say the least. However the pattern of revisions was examined for qualitative consistency with such an approach to revision – specifically to how many other suspects was a change in probability estimation of the target hypothesis distributed?
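The proportional heuristic described above can be sketched as follows. This is an illustration only: the function name and the example probabilities are hypothetical, and the sketch assumes that the prior probabilities already sum to 1 and that the new piece of evidence bears on the target hypothesis alone.

```python
def reallocate_proportionally(probs, target, new_value):
    """Set probs[target] to new_value and spread the change over the other
    hypotheses in proportion to their prior probabilities.

    Assumes probs sums to 1, so the result also sums to 1.
    """
    rest = 1.0 - probs[target]  # prior probability held by the non-target hypotheses
    if rest <= 0:
        raise ValueError("no probability mass on the non-target hypotheses")
    scale = (1.0 - new_value) / rest
    return [new_value if i == target else p * scale for i, p in enumerate(probs)]


# Hypothetical prior over five suspects; evidence raises suspect 0 from 0.30 to 0.40.
prior = [0.30, 0.25, 0.20, 0.15, 0.10]
revised = reallocate_proportionally(prior, target=0, new_value=0.40)
print([round(p, 4) for p in revised])
print(round(sum(revised), 10))
```

Every non-target suspect loses some probability under this rule, and the ratios among the non-target suspects are unchanged. If the evidence is assumed to be equally likely under each non-target hypothesis, this rescaling gives the same result as a Bayesian update.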
Again there was a substantial difference amongst the three conditions. The participants in the linear-scale nontutored condition did not adhere to complementarity and their revisions to the probabilities of nontarget hypotheses were relatively infrequent. In the majority of iterations (66%), the participants in the linear-scale tutored condition revised the probability of only one nontarget hypothesis to compensate for the probability revision to the target hypothesis. In contrast, in over 65% of the iterations, the participants in the circle-graph condition distributed the change in the probability estimation of the target hypothesis to at least three of the four alternative hypotheses. Table 4 shows the revision behavior patterns of the participants in the three conditions over the eight iterations. A χ² test on the patterns summed over the eight iterations (omitting the first row because of the two zero-value cells) yields a value of 196.404, p ≤ 0.001. Consistent with the idea expressed in the earlier-quoted statements of Bowman (1968, p. 173) and Chernoff (1978, p. 4), the circle-graph approach apparently promoted a more comprehensive consideration of the alternative hypotheses and the implications of the evidence for those hypotheses.
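The χ² tests reported in this section are tests of independence on contingency tables of counts by condition. As a minimal sketch, the statistic can be computed from the cell counts as shown below; the function name is hypothetical, and the example uses the resuscitation counts by condition (28 of 35 tutored participants, none in the other two conditions) rather than the revision-pattern counts.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Columns: circle-graph, linear-scale nontutored, linear-scale tutored.
resuscitation_counts = [
    [0, 0, 28],   # participants with at least one resuscitation
    [35, 35, 7],  # participants with no resuscitations
]
stat, dof = chi_square(resuscitation_counts)
print(round(stat, 3), dof)  # 76.364 2
```

The computed value agrees with the 76.364 reported for the resuscitation counts. Expected counts in the first row are below 10 per cell, which is the kind of extreme distribution that note 5 cautions about.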
Table 4. Probability Revision Patterns of the Three Conditions.

                          Times target   Number of times revision distributed to
                          hypothesis     0 other   1 other   2 other   3 other   4 other
Iteration   Condition     revised        suspects  suspect   suspects  suspects  suspects
1           CG            33             0         6         7         16        4
1           LSNT          29             17        1         8         0         3
1           LST           32             0         9         15        8         0
2           CG            30             0         6         9         12        3
2           LSNT          31             18        6         4         3         0
2           LST           32             0         22        9         1         0
3           CG            27             0         2         6         14        5
3           LSNT          28             21        6         1         0         0
3           LST           22             0         17        5         0         0
4           CG            31             0         4         11        14        2
4           LSNT          29             16        8         3         0         2
4           LST           33             0         24        8         1         0
5           CG            30             0         7         8         9         6
5           LSNT          31             21        7         2         0         1
5           LST           27             0         21        6         0         0
6           CG            33             0         5         5         18        5
6           LSNT          34             20        8         5         0         1
6           LST           31             0         21        8         1         1
7           CG            31             0         4         7         15        5
7           LSNT          33             21        8         2         0         2
7           LST           25             0         18        6         1         0
8           CG            31             0         3         5         19        4
8           LSNT          30             7         9         5         8         1
8           LST           34             0         23        6         3         2
Total       CG            246            0         37        58        117       34
Total       LSNT          245            141       53        30        11        10
Total       LST           236            0         155       63        15        3

Note: χ²(6) on summary data = 196.40, p ≤ 0.001. Abbreviations: CG, circle graph; LSNT, linear-scale nontutored; LST, linear-scale tutored.

Final Overall Judgments

Two additional measures were collected in the experiment relevant to the question of the differential effects of different response modes. The first measure relates to the question of the degree of confidence that the participants had in their final overall judgments of the probability of guilt of the set of suspects. If, as suggested by Bowman (1968) and Chernoff (1978), the circle-graph response mode fosters a more balanced and unified view, the participants using the circle-graph response mode may feel more confidence in their final judgments of the set of suspects.⁶ Because of the reduced cognitive demands of using the circle graphs to comply with completeness and complementarity, the circle-graph response mode might be perceived as less difficult. Conversely, the increased cognitive demands of maintaining completeness and complementarity might “get in the way,” and participants in the linear-scale tutored condition would be less confident in their final judgments. The linear scale was expected to be perceived as more difficult to use. Confidence scores were elicited on a 7-point Likert scale anchored by “1 – very unconfident” and “7 – very confident.” The order of the mean confidence scores was circle-graph condition (X̄ = 4.654) > linear-scale nontutored condition (X̄ = 4.220) > linear-scale tutored condition (X̄ = 3.463). A 3 × 1
ANOVA shows a significant difference across the three conditions (F = 9.140, p ≤ 0.001). Post hoc tests showed that the linear-scale tutored condition was significantly different (p ≤ 0.05) from the linear-scale nontutored and circle-graph conditions, which were not statistically different from each other. Panel A of Table 5 shows the results of the ANOVA on confidence scores. At the end of the experiment, participants were asked to rate the difficulty of the task. Difficulty scores were elicited on a Likert scale anchored by “1 – not difficult” and “7 – very difficult.” The order of the mean difficulty scores was linear-scale tutored condition (X̄ = 4.723) > linear-scale nontutored condition (X̄ = 3.677) > circle-graph condition (X̄ = 3.149). A 3 × 1 ANOVA shows a significant difference across the three conditions (F = 11.331, p ≤ 0.001). Post hoc tests showed that the linear-scale tutored condition was significantly different (p ≤ 0.05) from the linear-scale nontutored and circle-graph conditions, which were not statistically different from each other. Panel B of Table 5 shows the results of the ANOVA on difficulty scores.

Table 5. ANOVAs on Confidence and Difficulty Scores by Condition.

Source            Sum of Squares   Degrees of Freedom   Mean Square   F        Significance
Panel A: 3 × 1 ANOVA on confidence scores
Between groups    25.449           2                    12.725        9.140    0.000
Within groups     142.005          102                  1.392
Total             167.454          104
Panel B: 3 × 1 ANOVA on difficulty scores
Between groups    44.932           2                    22.466        11.331   0.000
Within groups     202.231          102                  1.983
Total             247.162          104

Note: Post hoc tests showed that the linear-scale tutored condition was significantly different (at p ≤ 0.05) from the linear-scale nontutored and circle-graph conditions, which were not statistically different from each other.

Additional Analysis

Another effect associated with the linear-scale tutored response mode observed in the course of coding and transcribing the participants’ responses supports the contention that the participants in that condition were influenced in their probability revisions by the response mode. Many participants in that condition made their re-estimates in multiples of 5%
throughout the eight iterations – i.e., every one of the 45 estimates and re-estimates (5 suspects × 9 iterations) ended in a zero or a five. It seems reasonable to suggest that it was a strategy to simplify the arithmetic required by the revision process, rather than a reflection of their true probability estimates at each revision. Neither of the other two groups used a similar strategy.
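The F statistics reported for the confidence and difficulty scores follow from the sums of squares and degrees of freedom in Table 5: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the between-groups to the within-groups mean square. The sketch below recomputes them; the function name is a hypothetical helper, and the inputs are the values reported in Table 5.

```python
def f_statistic(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares as reported in Table 5 (df: 2 between groups, 102 within groups).
reported = {
    "confidence": (25.449, 142.005),
    "difficulty": (44.932, 202.231),
}
results = {}
for label, (ss_b, ss_w) in reported.items():
    ms_b, ms_w, f = f_statistic(ss_b, 2, ss_w, 102)
    results[label] = f
    print(f"{label}: F = {f:.3f}")
```

The recomputed ratios agree with the reported F values of 9.140 and 11.331 to three decimal places.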
SUMMARY AND CONCLUSIONS This study examined the nature of MHR by auditors in a fraud investigation setting. The results of the auditors who used the circle-graph response mode indicate that at a qualitative level, their revisions were more consistent with the complementarity concept of axiomatic probability and recognized the interrelatedness within a set of competing hypotheses. This is consistent with normative standards, and in this context, implies a more efficient MHR strategy. This may have implications for audit practice. For instance, providing auditors with training and/or decision aids to promote and reinforce the probabilistic perspective may be beneficial. An interactive computerized decision aid with a visual component such as the circle graphs used here, or some form of bar-graph, could be used to help the auditor work through the consequences for all alternatives of adjusting the probability of a target hypothesis. The cumulative results of this paper suggest that this could bring increases in both efficiency and effectiveness in MHR tasks. Consideration of the responses of the participants in the linear-scale nontutored condition may provide an alternative explanation for the observed phenomenon characterized as independence in hypothesis revision. By revising the probability of one hypothesis, the ‘‘target hypothesis’’ (say, upwards) and not revising the probabilities of the nontarget hypotheses downwards, the decision maker has in fact, implicitly recognized that the nontarget hypotheses have become relatively less likely. If one were to renormalize the responses after every iteration, the results are qualitatively consistent with the concept of complementarity. Quantitatively, the responses are not correct (in particular the new value assigned to the target hypothesis may be significantly underestimated compared to the metrics of the PJP), but qualitatively, the revisions are consistent with the principles of the PJP. 
The participants in the linear-scale nontutored condition followed this pattern in 57% of their revisions. Normative probabilistic revision is not the only approach to belief revision. However, in this study, in circumstances that satisfy the constraints
within which probabilistic revision is normatively appropriate and after reducing the arithmetic complexity of the revision process, the participants displayed a much greater degree of compliance, qualitatively, with the norms of probabilistic revision. Nor did they violate Bayes’ Theorem by engaging in eliminations and resuscitations, suggesting that the concept of zero probability has an intuitive meaning. They distributed a change in the probability of the target hypothesis to alternative hypotheses in such a manner as to suggest that they have an intuitive understanding of the interrelation between alternative hypotheses and the implications of new evidence for that interrelationship.
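The renormalization argument made above for the linear-scale nontutored responses can be illustrated with a short sketch. The numbers are hypothetical, not responses from the study, and the function name is an assumption: a participant raises only the target suspect's estimate, leaving the others untouched, and the two sets of estimates are then rescaled to sum to 1.

```python
def renormalize(estimates):
    """Rescale a set of nonnegative estimates so that they sum to 1."""
    total = sum(estimates)
    if total <= 0:
        raise ValueError("estimates must have a positive sum")
    return [e / total for e in estimates]


# Hypothetical supra-additive responses (in %); suspect 0 is the target.
before = [60, 50, 40, 30, 25]  # sums to 205
after = [75, 50, 40, 30, 25]   # only the target estimate is raised

before_norm = renormalize(before)
after_norm = renormalize(after)
print([round(p, 3) for p in before_norm])
print([round(p, 3) for p in after_norm])
```

After rescaling, the target's share rises and every nontarget share falls, even though the participant changed only one raw estimate. This is the sense in which such revisions are qualitatively, though not quantitatively, consistent with complementarity.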
LIMITATIONS AND SUGGESTIONS FOR FUTURE RESEARCH
A limitation that must be recognized in interpreting the results of this experiment is that the experimental design was deliberately structured to eliminate ambiguity. In a real-world situation, the circumstances may be much less clear (see Srivastava et al., 2002 for an example). While a situation mirroring the characteristics of the experimental design is possible in practice, in many auditing contexts, such as analytical review, the competing hypotheses may not be mutually exclusive (or even exhaustive), and a piece of evidence may have direct implications for more than one hypothesis. Thus, there is no suggestion that the evidence of this study can be generalized to situations containing these more complex relations amongst hypotheses and evidence. However, these inferences are valid in the set of circumstances represented herein.
A limitation of this, and indeed almost all, experimental research is that one cannot properly make valid inferences about participants’ thought processes based on observations of their behavior. In this experiment, the auditors in the linear-scale nontutored condition behaved in a manner that appears to be more consistent with their treating the alternatives as a nonmutually-exclusive set. The auditors in the circle-graph condition appeared to behave in a manner that was qualitatively consistent with probability revision and Bayes’ Theorem. Interpreting the behavior of some of the participants in the linear-scale nontutored condition as consistent is also possible, as has been suggested above. Nevertheless, what one would really like to know is what the participants were thinking as they made their estimations and revisions. To that end, future research using some form of protocol analysis may offer the best opportunity for understanding the thought processes behind the behavior.
NOTES
1. The average change in probability rating of the target hypothesis in the pie-chart condition was 9.49%.
2. There is nothing inherent in the nontutored linear-scale response condition, nor in the circle-graph response condition, that prevents those participants from engaging in eliminations and resuscitations. They may decide at some intermediate point that the probability of a suspect is so low as to warrant elimination, and then subsequently change their minds.
3. As Asare and Wright (1997a, 1997b) point out, there are many other sets of circumstances where the hypotheses are not independent and/or the items of evidence may pertain to more than one alternative – in such circumstances Bayesian revision is not the normatively appropriate model. There are also alternative models of MHR that do not require the decision maker to adhere to the axioms of Bayes’ Theorem (Teigen, 1983; Srivastava & Shafer, 1992; Van Wallandael, 1989; Van Wallandael & Hastie, 1990).
4. The purpose of the combination of pre-experimental instructions and reminders throughout was to ensure, as far as possible, that the participants in the linear-scale tutored group did comply with the norms of probabilistic revision. The first objective was to examine whether adherence to the norms of probabilistic revision resulted in them displaying any other behaviors in their revisions – and then to contrast those other behaviors with the revision patterns of the participants in the pie-chart conditions. This does create a confound in assessing the results of the pre-experimental instructions to the linear-scale tutored group and is a limitation on any statements as to the efficacy of this treatment.
5. Given the extreme distribution in the table, the χ² value is no doubt inflated, but it seems reasonable to conclude that there is a significantly different distribution amongst the conditions.
6. The relationship of confidence to judgment quality is not clear. The two have been linked in the auditing literature (e.g., Weber, 1980), and Shanteau (1992) suggests that confidence is a necessary condition for expertise. On the other hand, the testimony literature has not found a positive correlation between confidence and accuracy. These results do not speak to this issue directly, but Shanteau (1989) suggests that the increased levels of confidence in a judgment may be an intangible factor that contributes to increased utility to the decision maker.
ACKNOWLEDGMENTS
The author thanks Michael Favere-Marchesi, Irene Gordon, Peter Clarkson and Graham Dover for their helpful comments. He also thanks the Institute of Chartered Accountants of British Columbia for financial support for this research.
REFERENCES
Anderson, J. C., & Kaplan, S. (1992). An investigation of the effect of presentation format on auditors’ noninvestigation region judgments. Advances in Accounting Information Systems, 1, 71–88.
Anderson, J. C., & Reckers, P. (1992). An empirical investigation of the effects of presentation format and personality on auditors’ judgment in applying analytical procedures. Advances in Accounting, 10, 19–43.
Asare, S. K., & Wright, A. (1995). Normative and substantive expertise in multiple hypothesis evaluation. Organizational Behavior and Human Decision Processes, 64(3), 171–184.
Asare, S. K., & Wright, A. (1997a). Hypothesis revision strategies in conducting analytical procedures. Accounting, Organizations and Society, 22(8), 737–755.
Asare, S. K., & Wright, A. (1997b). Evaluation of competing hypotheses in auditing. Auditing: A Journal of Practice and Theory, 16(1), 1–13.
Bhattacharjee, S., Kida, T., & Hanno, D. (1999). The impact of hypothesis set size on the time efficiency and accuracy of diagnostic decision makers. Journal of Accounting Research (Spring), 83–100.
Bowman, W. J. (1968). Graphic Communication. New York, NY: Wiley.
Cardinaels, E. (2008). The interplay between cost accounting knowledge and presentation formats in cost-based decision making. Accounting, Organizations and Society, 33(6), 551–581.
Chernoff, H. (1978). Graphical representations as a discipline. In: P. C. C. Wang (Ed.), Graphical representation of multivariate data (pp. 1–12). New York, NY: Academic Press.
Dutta, S. K., & Srivastava, R. P. (1993). Aggregation of evidence in auditing: A likelihood perspective. Auditing: A Journal of Practice and Theory, 12(Supplement), 137–160.
Gavanski, I., & Hui, C. (1992). Natural sample spaces and uncertain belief. Journal of Personality and Social Psychology, 63(November), 766–780.
Gnanadesikan, R. (1980). Graphic data analysis: Issues, tools, and examples. Presented at the annual meeting of the American Association of Advertising Science, San Francisco, CA.
Heiman, V. (1990). Auditors’ assessments of the likelihood explanations in analytical review. The Accounting Review, 65, 875–890.
Ives, B. (1982). Graphical user interfaces for business information systems. Management Information Systems Quarterly (Special Issue), 15–47.
Jarvenpaa, S. L., & Dickson, G. (1988). Graphics and managerial decision making: Research-based guidelines. Management of Computing, 31(6), 764–774.
Remus, W. (1984). An empirical investigation of the impact of graphical and tabular data presentations on decision making. Management Science (May), 533–542.
Remus, W. (1987). A study of graphical and tabular displays and their interaction with environmental complexity. Management Science (September), 1200–1204.
Robinson, L., & Hastie, R. (1985). Revision of beliefs: When a hypothesis is eliminated from consideration. Journal of Experimental Psychology: Human Perception and Performance, 11(August), 443–456.
Schulz, A. K-D., & Booth, P. (1995). The effects of presentation format on the effectiveness and efficiency of auditors’ analytical review judgments. Accounting and Finance, 35(1), 107–131.
Shafer, G. (1976). A mathematical theory of evidence. Princeton, NJ: Princeton University Press.
Shafer, G., & Srivastava, R. P. (1990). The Bayesian and belief-function formalisms: A general perspective for auditing. Auditing: A Journal of Practice and Theory, 9, 110–137.
Shanteau, J. (1989). Cognitive heuristics and biases in behavioral auditing: Review, comments, and observations. Accounting, Organizations and Society, 14(1/2), 163–177.
Shanteau, J. (1992). Competence in experts: The role of task characteristics. Organizational Behavior and Human Decision Processes, 53(2), 252–266.
Srivastava, R. P., & Shafer, G. (1992). Belief-function formulas for audit risk. The Accounting Review, 67, 249–283.
Srivastava, R. P., Wright, A., & Mock, T. J. (2002). Multiple hypothesis evaluation in auditing. Accounting and Finance, 42(3), 251–277.
Teigen, K. H. (1974a). Subjective sampling distributions and the additivity of estimates. Scandinavian Journal of Psychology, 15, 50–55.
Teigen, K. H. (1974b). Overestimation of subjective probabilities. Scandinavian Journal of Psychology, 15, 55–62.
Teigen, K. H. (1983). Studies in subjective probability III: The unimportance of alternatives. Scandinavian Journal of Psychology, 24, 97–105.
Torgerson, W. S. (1958). Theory and methods of scaling. New York, NY: Wiley.
Van Wallandael, L. (1989). The quest for limits on non-complementarity in opinion revision. Organizational Behavior and Human Decision Processes, 43, 385–405.
Van Wallandael, L., & Hastie, R. (1990). Tracing the footsteps of Sherlock Holmes: Cognitive representations of hypothesis testing. Memory and Cognition, 18, 240–250.
Vessey, I. (1991). Cognitive fit: A theory-based analysis of the graphs versus tables literature. Decision Sciences, 22, 219–240.
Waller, W. (1994). A behavioral-economics approach to auditors’ risk assessments. In: R. Srivastava (Ed.), Proceedings of the 1994 Deloitte & Touche/University of Kansas Symposium on Auditing Problems. Lawrence, KS: University of Kansas.
Weber, R. (1980). Some characteristics of the free recall of computer controls by EDP auditors. Journal of Accounting Research (Spring), 214–241.
Wright, P. (1974). The harassed decision maker: Time pressures, distractions and the use of evidence. Journal of Applied Psychology, 59, 555–561.
THE PARADOXICAL EFFECTS OF FEEDBACK AND REWARD ON DECISION PERFORMANCE
Siew H. Chan, Steve G. Sutton and Lee J. Yao
ABSTRACT
While the use of computerized decision aids in accounting is widespread, little is known about the effects of decision aids on accounting decision making. However, prior research has often noted the difficulty in getting users to accept and rely upon decision aids (Rose, 2002). A primary area of focus in the design of decision aids that will facilitate user acceptance and reliance has been the development of user-centered interfaces that increase the user’s comfort with the aid. This study contributes to this body of research by extending the findings of Ryan, Mims, and Koestner (1983) on the use of informational versus controlling rewards to the context of a decision aid and the interface design. While Ryan et al. focused on the effects of verbal feedback on intrinsic motivation, this study focuses on the impact of text-based feedback from a decision aid on decision performance for a choice task. Additionally, this study examines the effect of task-contingent versus performance-contingent rewards on the impact of the decision aid feedback. The results indicate a differential effect from that of Ryan et al. (1983) when feedback is provided through a decision aid and the focus is on decision performance rather than the precursor condition of intrinsic motivation. Additional research is needed to help explain why the findings obtained by Ryan et al. do not hold in the context of computerized decision aid use when decision performance is measured directly. There are important implications of these findings both in terms of theory development and decision aid design in professional decision-making environments such as accounting.
Advances in Accounting Behavioral Research, Volume 12, 109–143 Copyright © 2009 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012008
INTRODUCTION
The use of decision aids in accounting environments is recognized as being widespread, but not always with the intended impact on decision-making processes (Rose, 2002). For instance, mandatory decision aid use is common across all of the major international audit firms (Dowling & Leech, 2007); but, despite efforts to structure such technologies, users in these firms also routinely adapt the decision aids for use in unintended manners (Dowling, 2009). Research shows that the way users apply such decision aids is affected by both the design of the aid and socialization pressures exerted at the firm, office, and team levels (Dowling, 2009). Accordingly, a primary concern in the design of decision aids for facilitating user judgment is the reluctance users can exhibit with regard to acceptance and use of the aid (Davis, 1989). In response, researchers have advocated the need to consider users in the design of their systems (Sugar, 2001; Tzeng, 2004), believing that user-centered approaches to design will shape the way individuals use systems (Graham, Kennedy, & Benyon, 2000). As such, systems usability has become a critical objective in interface design as it plays a significant role in increasing the positive affect that users derive from interacting with the system (Johnson, Gardner, & Wiles, 2004; Johnson & Wiles, 2004). While debated among researchers (see Johnson, Marakas, & Palmer, 2006), the general belief is that interaction with the system is enhanced by the use of human-like characteristics in designing the systems interface (e.g., Reeves & Nass, 1996; Johnson et al., 2006). This use of human-like characteristics can have unintended (or intended) consequences as the computer interface assumes a certain “style, diction, and tone of voice which impact upon the [user’s] attitude and response toward the [system]” (Shirk, 1988, p. 320). In this respect, designers can influence users’ responses to a system in part through the nature of the feedback provided through systems applications, with positive feedback theorized to result in positive affect and, likewise, a favorable user attitude toward a system (Johnson et al., 2004). Feedback has accordingly become one of the central foci in studying users’ reactions to systems interfaces (e.g., Fogg & Nass, 1997; Johnson et al., 2004; Tzeng, 2004).
Beyond simply making the user comfortable or even happy, highly interactive interfaces, and in particular those that provide praise, flattery, or even apology upon poor outcomes (Fogg & Nass, 1997; Tzeng, 2004), have been shown to enhance user motivation (Kettanurak, Ramamurthy, & Haseman, 2001). This finding is further supported by Pace (2004), who found that enjoyment in the context of flow experienced by Web users resulted in intrinsic rewards from the activity. Still, Pace notes the paucity of research examining motivation within the context of systems use and design. Given the perceived relationship between feedback type and interface-generated incentives on a user’s intrinsic motivation, this body of research is extended in an effort to develop a more complete picture of the effects of feedback type and incentives on user decision performance in the presence of a useful decision aid. The design of the study is based on Ryan, Mims, and Koestner’s (1983) seminal work on the interrelationship between feedback type and reward type on intrinsic motivation in a human–human interaction environment. This study adopts Ryan et al.’s dichotomy between informational and controlling feedback in the implementation of systems interfaces designed to provide feedback to users in a human–computer interaction environment. The aid is provided to participants within the context of alternative incentive environments (using Ryan et al.’s dichotomy between performance-contingent and task-contingent incentives) to understand the effect of feedback provided by the decision aid on users’ performance. Additionally, the primary emphasis is on enhancing interfaces to incentivize user acceptance and use of the aid, given the need to improve user performance through aided decision making. Thus, this study focuses on the effects of feedback and incentives on user decision performance as opposed to focusing on the intermediary step of intrinsic motivation to perform as studied by Ryan et al. (1983).
The research presented here is important to decision aid development in accounting and other professional decision-making environments for several reasons. First, it introduces Ryan et al.’s (1983) seminal work on feedback and incentives into the dialog on systems interface design. Second, the study investigates the impact on Ryan et al.’s findings when feedback is provided through a decision aid’s text-based interface rather than verbally through a human. Third, the study focuses on the end goal of improved decision performance rather than focusing on the intermediary stage of intrinsic motivation that should lead to enhanced decision performance. Simply put, happy and engaged users are only of benefit in a decision aid environment if such users are also more effective at the decision task being supported.
The remainder of the paper is organized as follows. The next section outlines the theory in light of prior research findings and formalizes the hypotheses related to feedback, incentives, and the interaction of feedback and incentives respectively. The following section presents the research method applied, while the fourth section presents the results of the experimental study. The remaining sections address the implications, limitations, and conclusions.
THEORY AND HYPOTHESES
Feedback: Informational versus Controlling
As noted earlier, the use of human-like communication in the design of a system’s interface is believed to facilitate interaction with the user and likely enhance acceptance and use of the system (e.g., Reeves & Nass, 1996; Johnson et al., 2006). At the same time, by using human-like communication, the message conveyed by the systems interface will be affected by the manner in which the user interprets the semantics surrounding the message, such as the style and tone of voice (Shirk, 1988). This style and tone of voice are key components that help shape the feedback presented through a systems interface beyond just the intended message content. Several researchers have explored the impact of various types of message presentation on systems users’ behavior (Fogg & Nass, 1997; Johnson et al., 2004; Johnson et al., 2006; Tzeng, 2004). Fogg and Nass (1997) focused on the use of “sincere” praise, “flattery” (i.e., insincere praise) and generic feedback – the sincere and flattery forms were perceived to be more positive. The results of the study led the authors to suggest that adding such positive feedback to training and tutorial software could increase user enjoyment, task persistence, and self-efficacy. Additionally, Fogg and Nass (1997) posit that the positive feelings provided by the positive feedback engage the users and lead to greater success in systems usage (e.g., performance). Tzeng (2004) uses a similar strategy to address failures experienced when using a system, with the aim of alleviating the negative reactions that arise when use of the system is impaired. The feedback from the system was examined in the context of “apologetic” versus “non-apologetic” presentation. As anticipated, the apologetic feedback provided by the system created a favorable experience for the users (Tzeng, 2004).
The results add to the body of research suggesting that systems interface designers should be
conscious of the need to create favorable user perceptions of systems to increase positive user experiences that lead to increased systems use and improved decision performance. In this study, cognitive evaluation theory is used to enhance understanding related to feedback as a decision aid characteristic. Cognitive evaluation theory posits that events can be classified as either informational or controlling. Individuals receive informational feedback when they receive information about their competency at a task in a self-determined performance context (Ryan et al., 1983). Individuals receive controlling feedback when they experience pressure toward the attainment of specific outcomes (e.g., attaining a specified level of performance). While informational feedback promotes intrinsic motivation, controlling feedback undermines intrinsic motivation (Ryan et al., 1983). Individuals are more intrinsically motivated when they expect an informational rather than a controlling evaluation (Shalley & Perry-Smith, 2001). Prior studies have administered feedback either in an informational or controlling manner (e.g., Ryan, 1982; Ryan et al., 1983). Their results support the theorized effects because intrinsic motivation was lower in the controlling feedback condition than in the informational feedback condition. In a decision aid environment, the context differs from both a human–human interaction (as in Ryan et al.’s study) and a systems environment where the focus is simply on task completion as opposed to support of a user’s decision process. In a decision aid environment, the concern is less with a happy or contented user and more with the desire to improve decision-making performance. While getting the user to accept and use the system is critical and the nature of the supportiveness of the feedback important, some form of good feedback will assist the decision maker in performance improvement. Hence, consistent with Ryan et al.
(1983), this research posits that greater intrinsic motivation generated by informational as opposed to controlling feedback will lead to greater performance.1 Further, any feedback perceived as useful to the decision maker should improve decision performance – providing a potential design characteristic of importance for decision aids. In particular, Harackiewicz and Sansone (2000) note that a person receiving feedback on competence will become more interested in an activity because of enhanced feelings of competence and will accordingly exert more effort to improve performance. This is consistent with Ryan et al. (1983) who reported a marginal effect for feedback when they compared all feedback groups with the no-feedback groups. Ryan et al. concluded that regardless of the type of feedback, some feedback was better than none.
This leads to the first set of hypotheses related to feedback:
H1a. Individuals who use a decision aid with informational feedback will perform better than those who use a decision aid with no feedback.
H1b. Individuals who use a decision aid with controlling feedback will perform better than those who use a decision aid with no feedback.
H1c. Individuals who use a decision aid with informational feedback will perform better than those who use a decision aid with controlling feedback.
H1d. Individuals who use a decision aid with some type of feedback will perform better than those who use a decision aid with no feedback.
Prior research suggests that outcome feedback is important, but provides less guidance as to how that feedback should be provided. Stone (1995) found that outcome feedback led to better performance and greater self-insight into decision performance, while process feedback led to poorer performance but greater self-insight into decision processes. Tuttle and Stocks (1997) found that individuals confused their actual decision models with task information, but upon further study found that performance improved when outcome feedback was present (Tuttle & Stocks, 1998). Importantly from a decision aid standpoint, Gibson (1994) showed that feedback types, not screen layouts, had a significant effect on performance. Replication of Ryan et al.’s (1983) work on informational versus controlling feedback, placed within the context of a decision aid environment, provides a foundation for better understanding of the results related to feedback in these other studies. Additionally, an improved understanding of the effects of informational versus controlling feedback should assist designers in improving systems interfaces.
Reward: Task-Contingent versus Performance-Contingent
Factors affecting motivation, and thus effort and performance, are difficult to consider without also considering the reward structures that are in place for effort and performance. Individuals are said to be intrinsically motivated when they experience interest, excitement, and enjoyment in the performance of an activity (Deci, 1992). While rewards are primarily viewed as necessary
to provide extrinsic motivation, a meta-analysis of 128 well-controlled experiments examining the relationship between reward and intrinsic motivation revealed a significant and consistent negative impact of reward on intrinsic motivation for interesting activities (Deci, Koestner, & Ryan, 1999). This effect may arise because reward-oriented individuals are more directed toward goal-relevant stimuli, and the rewards actually divert such individuals’ attention away from the task and environmental stimuli that might promote more creative performance (Amabile, 1983). Condry (1977, pp. 471–472) suggests that rewarded individuals “work harder and produce more activity, but the activity is of a lower quality, contains more errors, and is more stereotyped and less creative than the work of comparable nonrewarded subjects working on the same problems.” On the other hand, there are many positive effects on performance derived generally from the introduction of extrinsic rewards. Extrinsic rewards may motivate individuals to spend more time on a task (e.g., Awasthi & Pratt, 1990) and influence focus on task performance (e.g., Klein, Goodhue, & Davis, 1997). Ryan et al. (1983) apply the same cognitive evaluation theory used to explain the effects of feedback to explain the effects that rewards can have on individuals’ behavior. In essence, Ryan et al. view rewards as another type of feedback mechanism. In laying out the foundation for the theoretical impact of rewards, Ryan et al. classify rewards into three categories: task noncontingent, task-contingent, and performance-contingent rewards. Task noncontingent rewards are provided simply for doing a task, without consideration of engagement in the task (Deci et al., 1999). An example of a task noncontingent reward is providing a gift for participation without regard for how participants perform during the experiment (Deci, 1972).
Task noncontingent rewards are not expected to affect intrinsic motivation because individuals are not required to perform well in the activity, complete the task, or even perform the task (Deci et al., 1999). Three meta-analyses performed by Deci et al. (1999), Tang and Hall (1995), and Cameron and Pierce (1994) all concluded that task noncontingent rewards do not have a significant impact on intrinsic motivation. Task-contingent rewards are administered only after an individual actually performs a task (Deci et al., 1999) and can be divided into two subcategories: completion-contingent and engagement-contingent. Completion-contingent rewards are provided only upon explicit completion of the target activity; engagement-contingent rewards are administered simply for engagement in the task, without consideration of completion of the task. An example of a completion-contingent reward is the study in which Deci (1971) asked individuals to work on four variations of a three-dimensional puzzle and
gave them $1 for each puzzle completed in the specified time. An example of an engagement-contingent reward is telling the participants that they would receive a reward for engaging in a series of hidden-figures puzzles (Ryan et al., 1983). These individuals were unaware of whether they had done well or completed the activity because they had no knowledge of the number of hidden figures in each drawing (Deci et al., 1999). Both completion-contingent and engagement-contingent rewards have about the same level of undermining effect (i.e., negative effect) on free-choice intrinsic motivation and self-reported interest (Deci et al., 1999). Performance-contingent rewards are administered for superior performance in the activity; that is, rewards are either a direct function of actual performance success (e.g., an 80% accuracy rate in a decision task resulting in 80% of the maximum possible reward) or achievement of a specific standard (e.g., performed better than 80% of the other participants or achieved at least an 80% accuracy rate on a decision task). Ryan et al. (1983) theorize that the context in which performance-contingent rewards are administered can facilitate or debilitate intrinsic motivation. This variation follows from cognitive evaluation theory, under which the salient informational or controlling aspect of a reward mediates the effect of that reward. Thus, performance-contingent rewards can have a positive or negative impact on intrinsic motivation. Intrinsic motivation is maintained or increased if the performance-contingent reward is perceived to provide competence information; in contrast, intrinsic motivation is impaired if the reward is used to control how well a person does in a task (Ryan & Deci, 2000). The interpersonal context in which performance-contingent rewards are administered can convey either competency or pressure to do well in an activity (Ryan et al., 1983).
In short, informational administration of performance-contingent rewards leads to increased intrinsic motivation while controlling administration of such rewards leads to decreased intrinsic motivation (Harackiewicz, 1979; Ryan et al., 1983). In light of these theorized relationships, differences in performance would be expected from decision makers using a decision aid based on the reward structure in place. When offered a task-contingent reward, the decision maker may view this as overjustification, creating an undermining effect in comparison to the intrinsic motivation that would be present when no reward is present (e.g., Deci, 1972; Eisenberger & Cameron, 1996; Lepper, Greene, & Nisbett, 1973; Ryan & Deci, 1996; Sansone & Harackiewicz, 1998). This undermining effect arises when individuals are rewarded for doing an interesting activity. The response to the reward is generally for
individuals to exhibit less interest in, and willingness to, work on the activity (Deci & Ryan, 1987). This leads to the next hypothesis:
H2a. Individuals who receive no reward will perform better than individuals who receive a task-contingent reward.
Performance-contingent rewards can be more controlling, demanding, and constraining than task-contingent rewards because a particular standard of performance is expected. This leads to greater pressure and subsequent larger decrements in intrinsic motivation than in conditions where task-contingent rewards are administered (Harackiewicz & Sansone, 2000). Performance-contingent rewards have also been shown to undermine intrinsic motivation and performance (e.g., Boggiano & Ruble, 1979; Daniel & Esser, 1980; Ryan et al., 1983). However, performance-contingent rewards may lead to higher performance when individuals are motivated to work harder and put in more effort than they otherwise would (Harackiewicz & Sansone, 2000); therefore, performance-contingent rewards may be effective for improving decision performance (Lepper, 1981). Performance-contingent rewards provide information about competence at a given activity and motivate a person to care about doing well (Harackiewicz & Manderlink, 1984). When a person is offered a performance-contingent reward, she may strive to demonstrate competence and treat the activity as an achievement task (e.g., Dweck, 1986; Harackiewicz, Abrahams, & Wageman, 1987; Nicholls, 1984). A meta-analysis by Deci et al. (1999) indicated that performance-contingent rewards impaired the behavioral measure of intrinsic motivation but not the self-reported measure. This is contradicted by a meta-analysis conducted by Eisenberger, Pierce, and Cameron (1999) suggesting that performance-contingent rewards have a null or positive impact on the behavioral measure of intrinsic motivation and a positive impact on the self-reported measure (Harackiewicz & Sansone, 2000).
This study is different from existing motivation studies in that it examines the impact of the type of reward on decision performance directly. As such, the potentially minimal detrimental effects of performance-contingent rewards on intrinsic motivation should be balanced by the extrinsic motivation generated by the reward. The related hypotheses are stated as follows:
H2b. Individuals who receive a performance-contingent reward will perform better than those who receive no reward.
H2c. Individuals who receive a performance-contingent reward will perform better than those who receive a task-contingent reward.
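The reward categories discussed in this section can be made concrete with a small sketch. The functions below contrast a completion-contingent (task-contingent) payment with the two forms of performance-contingent payment described earlier: a reward proportional to accuracy and a reward paid only when a standard is met. The function names, the dollar amounts, and the 80% standard are illustrative assumptions drawn from the examples in the text; they are not the payment scheme used in this study's experiment.

```python
def completion_contingent_reward(tasks_completed, pay_per_task):
    """Task-contingent: pay for each task completed, regardless of quality."""
    return tasks_completed * pay_per_task


def proportional_performance_reward(accuracy, max_reward):
    """Performance-contingent: reward is a direct function of accuracy (0 to 1)."""
    return accuracy * max_reward


def standard_performance_reward(accuracy, max_reward, standard=0.80):
    """Performance-contingent: full reward only if the standard is met."""
    return max_reward if accuracy >= standard else 0.0


# Hypothetical participant: completes 4 tasks with 75% accuracy.
print(completion_contingent_reward(4, 1.0))          # 4.0
print(proportional_performance_reward(0.75, 10.0))   # 7.5
print(standard_performance_reward(0.75, 10.0))       # 0.0
```

Only the last two schemes tie payment to how well the task is done, which is why they are expected to carry competence information as well as pressure to perform.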
Replication of Ryan et al.’s (1983) work on task-contingent versus performance-contingent rewards, placed within the context of a decision aid environment, provides a foundation for better understanding of the effects of reward structures on decision aid usage behavior. However, given the focus on decision performance in a decision aid environment, the anticipated effects of task-contingent rewards on decision performance are different from the anticipated effects of task-contingent rewards on intrinsic motivation.
Interactive Effect of Feedback and Reward

The importance of considering the joint effect of feedback and reward is outlined by Ryan et al. (1983). In short, reward structures have informational and controlling attributes as perceived by the individuals subject to the reward, and these informational and controlling attributes commingle with the informational and controlling nature of feedback. The perceptions of reward structures can be heavily influenced by the nature of feedback with informational feedback highlighting the informational aspects of a reward structure, and controlling feedback highlighting the controlling aspects of a reward structure. Ryan et al. (1983) examined the impact of feedback and reward on intrinsic motivation. They found that compared to the no-feedback/no-reward condition, intrinsic motivation was higher under the informational feedback/performance-contingent reward condition and lower under the controlling feedback/performance-contingent reward condition. Reward is an example of a controlling event that in itself may work against the positive effect of the information contained in the performance-contingent reward (Ryan & Deci, 2000). Although interest may be undermined by the prospect of reward during task performance, this effect may be offset by enhanced performance motivated by the expectation of reward (Deci & Ryan, 1985). Decision performance may not be undermined in the presence of informational feedback and performance-contingent rewards because cue value (Harackiewicz, Manderlink, & Sansone, 1984) may highlight the informational aspect of performance-contingent rewards and offset their controlling aspect. Harackiewicz et al. (1984) found that performance-contingent rewards enhanced intrinsic motivation relative to a condition where individuals were informed that they would be evaluated and subsequently received positive feedback but did not receive a reward.
This result can be attributed to the absence of the reward containing the cue value that offsets the control contained within the evaluation. However, relative to a no-reward control
group that was not informed they would be evaluated but received positive feedback, performance-contingent reward did not increase intrinsic motivation. Finally, the research question examines whether the feedback characteristic of a decision aid interacts with reward to affect decision performance. Research Question: Do feedback and reward interact to affect decision performance?
EXPERIMENTAL METHOD AND PROCEDURES

This research study used a 3 × 3 between-subjects fractional factorial design. The first manipulated variable was the type of feedback: no feedback, informational, or controlling feedback. The second manipulated variable was the type of reward: no reward, task-contingent or performance-contingent reward. Participants were randomly assigned to six experimental conditions based on the particular treatment combinations of interest.2 Participants completed two tasks: an apartment-selection task and a career-selection task. The purpose of the apartment-selection task was to train participants in the use of the decision aid required for completion of the experimental task – career selection.

Participants

A total of 151 undergraduate students participated in the study on a voluntary basis. Solicitation of voluntary participants for studies is difficult and was compounded by the fact that one of the variables being studied was reward structure. Voluntary participants generally expect that there will be some form of reward for their participation. To alleviate potential effects on our study from participants’ a priori expectations, support was garnered from student leaders who distributed information about the study and offered a specially designed T-shirt to students who were willing to attend the experimental sessions. The T-shirt was used as a task noncontingent reward in an effort to eliminate the participants’ expectation of rewards related to task completion and/or task performance. As noted earlier, task noncontingent rewards are not expected to affect motivation because individuals are not required to perform well in the activity, complete the task, or even perform the task (Deci et al., 1999). However, if participants anticipate that they will be rewarded, motivation can be undermined when rewards are not provided (Deci & Ryan, 1987; Lepper et al., 1973; Ross, 1975; Ryan et al., 1983).
Computerized Experimental Task

A choice task was selected as the participants face choice decisions in many situations, and decision aids are commonly developed to assist users in making effective and efficient choices. Participants in this study performed two tasks – apartment and career selections. The choice tasks were simple enough to allow students without prior exposure to the decision aid to complete the requirements and make choices with minimal training. Prior research has shown that apartment selection is an effective experimental choice task for examining individual behavior (e.g., Chu & Spires, 2001; Lohse & Johnson, 1996; Payne, 1976; Todd & Benbasat, 1994a, 1994b). Career selection is also a widely studied topic in the psychology literature (e.g., Baker, 1999; Kinsella, 1998; Kapes, Whitfield, & Mastie, 1994). The apartment-selection task was used to train the participants in using the decision aid to facilitate completion of the experimental task, career selection. The System of Interactive Guidance (SIGI Plus) developed by the Educational Testing Service was used to obtain the attributes for the career-selection task. SIGI Plus is a career guidance system designed to assist individuals in making career decisions. The SIGI Plus requires responses (using one of four alternative responses: not important, desirable, very important, or essential) for 16 questions that deal with their personal values and job values. The set of questions pertaining to personal values includes contribution to society, income, independence, leadership, leisure, prestige, security, and variety. The set of questions related to job values includes advancement, challenge, commuting, flexible hours, fringe benefits, on-the-job learning, co-workers, and mobility. These 16 questions were narrowed to eight attributes for the purpose of the career-selection task.
The eight questions used by the decision aid include contribution to society, income, prestige, security, advancement, challenge, flexible hours, and fringe benefits. This selection was based on the ease of formulating meaningful choices for the attributes when the participants used the decision aid to complete the career-selection task. Each attribute had two levels of choices formulated in a manner that no one level clearly dominated the other level. The choice tasks involved selecting one alternative from a number of available alternatives. While a number of ways for processing information in choice tasks have been presented in prior research (Svenson, 1979), the additive difference (AD) processing strategy was incorporated in
the design of the decision aid. AD processing compares two alternatives simultaneously by comparing each attribute, finding the difference, and summing the differences. The AD processing strategy requires some method for weighting each attribute, a method of transformation to put all attributes into compensatory units, and a way to sum the weighted values of the attributes. After a series of alternative comparisons, the alternative with the greatest sum is chosen. AD processing is compensatory in that values on one attribute necessarily offset the values on another attribute. It makes more complete use of available information and is normatively more accurate than noncompensatory strategies such as elimination by aspects (Tversky, 1972). Use of the more accurate and effortful AD strategy relative to other less accurate and effortful strategies may be encouraged if users are provided with a decision aid that reduces the effort for completing a task.

Each use of the decision aid resulted in a screen with a three-table display: a list of the attributes and attribute values for the first alternative, a list of attributes and attribute values for the second alternative, and a brief statement of how the alternatives differed on each attribute. PHP Trad and HTML-Kit editor were used to program the decision aid. The Apache Web server and MySQL database were required on each computer for the aid to function. Database tables in MySQL stored the data for the attributes, alternatives, and participants. These were dynamically queried to populate the data on the decision aid screens using a combination of JavaScript and structured query language (SQL) statements. Once the users clicked on the “I am done” button at the end of the experiment, all of the participant’s data were written to the database.
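The additive difference strategy described above can be illustrated with a short sketch. It is not the decision aid's actual code: the function names, the integer weights, and the 0 to 10 attribute scale are assumptions made for illustration, and attribute values are taken to be already transformed into common units.

```python
def additive_difference(first, second, weights):
    """Weighted sum of attribute-by-attribute differences.

    A positive result favors `first`; a negative result favors `second`.
    """
    return sum(w * (x - y) for w, x, y in zip(weights, first, second))


def choose_by_additive_difference(alternatives, weights):
    """Run a series of pairwise comparisons, keeping the winner each time.

    Returns the index of the alternative that survives all comparisons.
    """
    best = 0
    for i in range(1, len(alternatives)):
        if additive_difference(alternatives[i], alternatives[best], weights) > 0:
            best = i
    return best


# Hypothetical example: three alternatives scored 0-10 on three attributes.
weights = [5, 3, 2]
alternatives = [[2, 9, 5], [8, 4, 6], [6, 6, 9]]
winner = choose_by_additive_difference(alternatives, weights)
```

Because the weighted differences are additive, the surviving alternative is also the one with the greatest weighted sum, which is the compensatory property the text describes.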
Manipulation of Feedback

While Ryan et al. (1983) manipulated feedback verbally, this study manipulated feedback via alert messages provided by the decision aid. To reinforce their manipulation, Ryan et al. also explained verbally to their participants the nature of the feedback they would receive. Replicating Ryan et al.’s process, the nature of the feedback (i.e., informational or controlling) provided to the participants was explained via information displayed on the screens prior to the apartment-selection (training) and career-selection (experimental) tasks. During both the training and experimental tasks, participants in the feedback conditions received feedback messages from the decision aid. The
feedback messages were modeled after those used by Ryan et al. (1983). The informational feedback group received the following messages: “You did well!,” “You did fairly well!,” or “You did very well!” The controlling feedback group received the same messages with the additional phrase “just as you should”: “You did well, just as you should!,” “You did fairly well, just as you should!,” or “You did very well, just as you should!” We randomized the order of these messages. The responses received from pretest participants suggested little difference in the connotations of the phrases – “You did well!,” “You did fairly well!,” and “You did very well!” Prior to the experimental task, the informational feedback group was instructed that: “Following the career selection task, you will receive feedback on how well you are doing. Just do as well as you can.” The controlling feedback group was instructed: “Following the career selection task, you will receive feedback on whether you are performing as well as you should. You should try as hard as possible because you are expected to perform up to standards on the career selection task.”
Manipulation of Reward

Immediately before beginning the career-selection task, the task-contingent reward group received information via the decision aid that they would receive $5 for doing the career-selection task. Accordingly, each of the participants received $5 at the end of the study. Prior to commencement of the career-selection task, the performance-contingent group received the following information via the decision aid: “We received some extra money from a grant, so we will be able to pay those who do well at the career-selection task. The amount of your payment will depend on your performance. Your personal preference equation will be computed from your ratings of 16 different careers and this will be compared with your final career choice selection. The amount of your payment; i.e., $10, $5, or $2 will appear at the end of this study.” The database contained id numbers with randomly preselected reward amounts. Participants preselected for the $10 reward saw the message “You did very well!” or “You did very well, just as you should!” Those preselected for the $5 reward saw the message “You did fairly well!” or “You did fairly well, just as you should!” Finally, individuals preselected for the $2 reward saw the message “You did well!” or “You did well, just as you should!” The no-reward group did not receive any money at the end of the study. The task-contingent and performance-contingent groups were exposed to
manipulation of the reward immediately prior to the start of the career-selection task. Thus, the participants expected the reward while they were doing the experimental task. A requirement of cognitive evaluation theory is that the reward must be expected while a person is doing the task for the theorized effects on motivation to hold (Deci et al., 1999).
Experimental Procedure

Each experimental session was scheduled for 45 min, and this was generally adequate time for the participants to complete the training and experimental tasks. The experiment was administered in a university computer laboratory with all participants using identical desktop computers linked through a local area network. One of the researchers and two research assistants administered each experimental session. Participants were randomly assigned to each experimental session to balance the effects of time of day, day of week, and any other extraneous factors that might affect the particular experimental setting. At the start of the experiment, participants were given a card showing their user name and assigned identification numbers. The user name and assigned identification numbers were used to log onto the system to trace the data for proper storage in the database. The identification number served the dual purpose of a password and a treatment code to trigger the loading of the proper treatment decision aid. All identification numbers were required to be entered twice to assure proper assignment to the intended treatment condition. After each participant entered their identification numbers, they were provided with an overview of the decision aid on the screen. The next screen explained the details of what the experimental process entailed. Specifically, this screen discussed the six stages of the experimental process. First, participants rated three different apartments based on a set of six attributes (rent, number of bedrooms, size, distance to public transport, distance from work, surrounding environment). Second, they completed an online tutorial on using the decision aid to perform the choice task. This tutorial had a set of four apartments and four attributes (rent, number of bedrooms, size, and distance to public transport).
Participants could click on the “Help” button at any time during the tutorial to review the directions for using the decision aid. At the end of the tutorial, they were required to rate their understanding of the tutorial on a scale from 1 to 100 with “1 = least and 100 = most.” The purpose of the tutorial was to prepare participants for the apartment-selection task (Stage III of the study) that required them to select their most
preferred apartment out of three different groups of apartments, each with five apartments and six attributes. The attributes were the same six attributes that they saw during their ratings of the three different groups of apartments. The informational and controlling feedback groups saw an alert message after each apartment selection. The messages that the informational feedback group saw were “You did well!,” “You did fairly well!,” and “You did very well!” (see Fig. 1). The controlling feedback group saw the same messages with an additional phrase, “just as you should” (see Fig. 2). These alert messages mirrored the experimental manipulations for feedback to get the participants acclimated to the message (i.e., no surprises when a new alert message showed up during the experimental task). These messages appeared for 5 s before the decision aid went to the next screen. The purpose of this delay was to ensure that the participants actually saw the messages. The no-feedback group did not see any message.

Fig. 1. Career Selection (Informational Feedback).

Fig. 2. Career Selection (Controlling Feedback).

The fourth stage marked the beginning of the actual experimental process – the first three stages all relating to training and preparation for the experiment. Participants began the experimental process by using a computer slide bar (on a scale from 1 to 100) to rate their likelihood of choosing 16 different careers based on a set of eight attributes (see Fig. 3). The purpose of these ratings was to gather information on the participants’ assessment of the relative importance (i.e., utility) for each attribute. Upon completion of the relative importance assessment, the task-contingent reward group saw a screen indicating that they would receive a $5 reward for doing the task. The performance-contingent reward group saw a screen telling them that the amount of their reward (i.e., $10, $5, or $2) would depend on their performance. These screens established the reward manipulation in the experiment.

Fig. 3. Career Rating Task.

Participants then moved on to the fifth stage of the experimental process where they completed the career-selection task. Participants chose their most preferred career from a group of eight different careers based on a set of eight attributes (see Fig. 4). These attributes were the same attributes that they saw during their ratings of 16 different careers. At the end of the career-selection task, participants in the informational (controlling) feedback/performance-contingent reward condition selected for the $10 reward saw the message “You did very well!” (“You did very well, just as you should!”). Those selected for the $5 reward saw the message “You did fairly well!” or “You did fairly well, just as you should!” Participants selected for the $2 reward saw the message “You did well!” or “You did well, just as you should!” The informational (controlling) feedback/no-reward participants saw the message “You did well!” (“You did well, just as you should!”).

Fig. 4. Career Selection Task.

The sixth and final stage of the experimental process required participants to provide their demographic information and complete a series of questions on career selection and use of the decision aid to do the task. All data captured during the experiment, including the demographic data, were written out to the MySQL database once the participants
clicked on the “I am done” button in the final stage. Upon clicking the “I am done” button they saw a screen thanking them for their participation in the study.
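The end-of-task messages described in this procedure amount to a small lookup keyed by feedback condition and preselected reward amount. The sketch below is only an illustration of that mapping. The message wording comes from the text, but the function name, argument names, and condition labels are hypothetical, and it does not reproduce the PHP and MySQL implementation.

```python
# Message stems keyed by the randomly preselected reward amount (in dollars).
BASE_MESSAGE_BY_REWARD = {
    10: "You did very well",
    5: "You did fairly well",
    2: "You did well",
}
CONTROLLING_SUFFIX = ", just as you should"


def end_of_task_message(feedback, reward_amount=None):
    """Return the alert text shown after the career-selection task.

    feedback: "none", "informational", or "controlling" (hypothetical labels).
    reward_amount: 10, 5, or 2 for performance-contingent participants,
    None for participants who receive no performance-contingent reward.
    """
    if feedback == "none":
        return None  # the no-feedback group saw no message
    if reward_amount is None:
        stem = "You did well"  # feedback/no-reward participants
    else:
        stem = BASE_MESSAGE_BY_REWARD[reward_amount]
    suffix = CONTROLLING_SUFFIX if feedback == "controlling" else ""
    return stem + suffix + "!"
```

For example, `end_of_task_message("controlling", 5)` yields the controlling variant of the $5 message, while any call with `feedback="none"` yields no message.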
Dependent Variable: Decision Performance

The motivation literature is replete with studies (e.g., Calder & Staw, 1975; Eisentein, 1985; Griffith, 1984; Newman & Layton, 1984; Smith, 1980; Wilson, 1978) that manipulate intrinsic motivation (or interest) to examine its impact on indirect measures of decision performance such as a person’s subsequent free-choice behavior or self-reported interest. Existing accounting and information systems research has also not developed a reliable measure of decision performance. Rather, decision performance has been measured indirectly via time taken to make a decision (e.g., Kottemann, Davis, & Remus, 1994; Paquette & Kida, 1988); speed and accuracy (e.g., Amer, 1991; Eining & Dorr, 1991; Hunton & McEwen, 1997); user satisfaction (Gibson, 1994; Hunton, 1996); decision quality (Gonzalez & Kasper, 1997); changes in heuristic bias (Arnold, Collier, Leech, & Sutton, 2004); and shift in decision making toward a system’s recommendation (Arnold, Clark, Collier, Leech, & Sutton, 2006) – none of which is necessarily an accurate representation of decision performance. The basic problem in measuring decision performance comes down to the trade-off between deterministic decisions where there is a right answer but the decision process is fairly straightforward versus judgment-oriented decisions where there is no “right” answer, but rather the end solution is a judgment call. The latter situation is of most interest in a decision aid environment as most decision aids are designed to facilitate judgment-based rather than deterministic decisions. This study introduces an alternative measure with a focus on the decision process and related outcomes. An individual’s preferred decision process can be determined in a multiattribute choice task if the underlying utility function of that individual can be derived. A utility function is a model that represents the relative importance of different attributes to a given decision maker.
If the underlying utility function is understood, then decision performance can be assessed based on the effectiveness with which an individual is able to aggregate various attribute values and assimilate those values into a decision that matches the individual’s actual utility function. This measure has an advantage over other existing measures in the decision performance literature in that it provides an immediate and reliable assessment of the qualitative aspect of decision performance.
Conjoint analysis is used to determine each participant’s utility function. Conjoint analysis is a commonly used method for predicting individual preferences in a multiattribute and multialternative task (Green & Srinivasan, 1978, 1990). The technique is widely used in marketing research to assess consumers’ utility for alternative products or services (Green, Krieger, & Wind, 2001). The evolution of conjoint analysis is documented in the seminal work of Luce and Tukey (1964). Several psychometricians (e.g., Carroll, 1969; Kruskal, 1965; Young, 1972) have used the theoretical contributions to develop models for determining respondents’ preferences for multiattribute products or services. In conjoint analysis, participants see a set of alternatives based on different levels of attributes. The participants then pick their most preferred alternative in the choice set by allocating a total of 100 points across the alternatives to indicate their likelihood of choosing the alternatives based on the given attribute values (Carroll & Green, 1995). The data obtained from these assessments contain the individuals’ utility functions (Green et al., 2001). In this study, a regression model was run for each participant to obtain a set of coefficients for determining the individual’s utility function. Consistent with the literature, the experimental task consisted of a rating task and a selection task. Participants rated 16 different alternatives (careers based on a set of attribute values). The rating task had 16 alternatives; each alternative was described by eight attributes with one of two attribute values. The data obtained from the rating task consisted of 16 scores for each participant. For each participant, a regression model was run to obtain a set of coefficients to determine the individual’s utility function. The decision performance measure was obtained by comparing each individual’s utility function with his or her choice in the experimental selection task. 
Decision performance was considered the best (i.e., coded as 1) if the participants’ first best choice using the utility function matched their most preferred choice in the experimental task. The second best and third best decision performance were coded as 2 and 3 respectively. The remaining levels of decision performance (i.e., fourth or worse) were coded as 4.
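The per-participant regression and the 1 to 4 performance coding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure. The chapter does not say how the 16 career profiles were constructed, so the sketch assumes a balanced, orthogonal two-level design coded -1/+1, for which the ordinary least-squares estimates reduce to simple averages. All function names and the simulated ratings are hypothetical.

```python
from itertools import product


def orthogonal_profiles():
    """Return 16 profiles over 8 two-level attributes, coded -1/+1.

    Assumed design: a 2^(8-4) fractional factorial built from four base
    factors. The study's actual profiles are not given in the text.
    """
    return [(a, b, c, d, a * b * c, a * b * d, a * c * d, b * c * d)
            for a, b, c, d in product((-1, 1), repeat=4)]


def fit_utility(profiles, ratings):
    """Least-squares intercept and attribute weights for one participant.

    This closed form equals ordinary least squares only because the
    assumed design columns are balanced and mutually orthogonal.
    """
    n = len(profiles)
    intercept = sum(ratings) / n
    weights = [sum(p[j] * y for p, y in zip(profiles, ratings)) / n
               for j in range(len(profiles[0]))]
    return intercept, weights


def decision_performance(intercept, weights, alternatives, chosen_index):
    """Code a choice as 1, 2, or 3 by fitted-utility rank, or 4 if worse."""
    utilities = [intercept + sum(w * x for w, x in zip(weights, alt))
                 for alt in alternatives]
    better = sum(u > utilities[chosen_index] for u in utilities)
    return min(better + 1, 4)


# Simulated participant (illustrative numbers, not data from the study).
profiles = orthogonal_profiles()
true_weights = [10, 8, 6, 4, 3, 2, 1, 0.5]
ratings = [50 + sum(w * x for w, x in zip(true_weights, p)) for p in profiles]

intercept, weights = fit_utility(profiles, ratings)
alternatives = profiles[::2]  # eight hypothetical careers for the choice task
top_choice_code = decision_performance(intercept, weights, alternatives, 7)
```

With a non-orthogonal design, the same coefficients would have to be obtained from a general least-squares solver rather than the averaging shortcut used here.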
RESULTS

Demographics and Control Tests

All 151 participants completed the experiment. Their ages ranged from 18 to 24 and the mean was 21. Their self-reported computer proficiency3 was 4.3 on
a scale from 1 to 7 with 1 = very low and 7 = very high. Tests were also conducted to measure initial intrinsic and extrinsic motivation. Control tests for potential effects indicated that the participants’ demographics (i.e., age, gender, major, computer proficiency, initial intrinsic motivation, and initial extrinsic motivation) did not have an impact on decision performance. A manipulation check was also included to measure the participants’ understanding of how to use the decision aid. An average rating of 78 on a scale of 1–100 indicated that participants felt they had a good understanding of the decision aid. Further control tests suggested that participants’ ratings on their understanding of the decision aid were not related to their decision performance.
Results for H1 (Feedback)

Table 1 presents the descriptive statistics for decision performance. As Panel A indicates, consistent with Ryan et al. (1983), our 3 × 3 experimental design has some missing cells. Panels B and C show the descriptive statistics for decision performance for the types of feedback and types of reward respectively. Based on the fractional factorial design applied in the study, the hypotheses are tested using independent samples t-tests. Hypothesis 1 examines whether feedback has an impact on decision performance (see Table 2). Results indicate that the controlling feedback groups perform better than the no-feedback groups (p = 0.01). However, no significant differences are found between the no-feedback and informational feedback conditions, and between the informational feedback and controlling feedback conditions4. Similar to Ryan et al.’s (1983) findings for the intrinsic motivation measure (p = 0.06), the results show that users who use a decision aid with the feedback characteristic perform better than those who use a decision aid without the feedback characteristic (p = 0.02). Thus, H1a and H1c were not supported, but H1b and H1d were supported.5 While the results for H1a and H1c are surprising, the lack of significance may be related to the interplay among decision aids, technology and associated user behavior, which leads to greater acceptance of controlling feedback by users. The theory of technology dominance suggests that systems users in nondeterministic tasks will often allow the system to take a position of dominance in the human–computer relationship (Arnold & Sutton, 1998). This seems particularly plausible in this situation given the participants’ responses to a question on the extent to which the decision aid
Table 1. Descriptive Statistics for Decision Performance*.

Panel A: 3 × 3 Fractional Factorial Design

                 Reward: None         Task contingent      Performance contingent
Feedback         Mean    Std. Dev.    Mean    Std. Dev.    Mean    Std. Dev.
None             3.38    0.898        3.80    0.500
Informational    3.36    1.075                             3.36    1.075
Controlling      3.12    0.927                             3.16    1.106

Panel B: Types of Feedback

Feedback         Mean    Std. Dev.
None             3.59    0.75
Informational    3.36    1.06
Controlling      3.14    1.01

Panel C: Types of Reward

Reward                    Mean    Std. Dev.
None                      3.29    0.96
Task-contingent           3.80    0.50
Performance-contingent    3.26    1.08

*Decision performance had the following four values: 1, 1st best choice; 2, 2nd best choice; 3, 3rd best choice; 4, 4th or worse choice.
provided positive feedback. Compared to the no-feedback and informational feedback groups, the controlling feedback group felt much more strongly that the decision aid provided them with positive feedback on how well they were doing (p = 0.01 and p = 0.02 respectively).
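As a rough check on how an independent samples t-test relates to the reported summary statistics, the sketch below computes a pooled-variance t statistic for the no-feedback versus controlling feedback comparison, using the rounded means, standard deviations, and group sizes from Table 1 (Panel B) and Table 2 (Panel A). Whether the authors pooled variances is not stated, so the pooled form is an assumption, and the function name is hypothetical.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from summary statistics.

    Assumes equal population variances (pooled-variance form).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# No feedback (n = 51) versus controlling feedback (n = 50).
t_stat, df = pooled_t_from_summary(3.59, 0.75, 51, 3.14, 1.01, 50)
```

Under these assumptions the statistic is about 2.5 on 99 degrees of freedom, which is broadly in line with the reported p = 0.01. Because the inputs are rounded, the match can only be approximate.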
Results for H2 (Rewards)

H2 examines the impact of alternative types of reward on decision performance (see Table 2). The results indicate that the no-reward group performed better than the task-contingent reward group (p = 0.00), supporting H2a. The results for decision performance are consistent with Ryan et al.’s (1983) findings for intrinsic motivation. The findings also show
Table 2. Results for H1 and H2.

Panel A: Effect of Types of Feedback on Decision Performance*

Type of feedback           n      Performance (mean)    p-value
No feedback                51     3.59                  0.22
Informational feedback     50     3.36

No feedback                51     3.59                  0.01
Controlling feedback       50     3.14

Informational feedback     50     3.36                  0.29
Controlling feedback       50     3.14

No feedback                51     3.59                  0.02
Feedback                   100    3.25

Panel B: Effect of Types of Reward on Decision Performance

Type of reward             n      Performance (mean)    p-value
No reward                  76     3.29                  0.00
Task-contingent            25     3.80

No reward                  76     3.29                  0.88
Performance-contingent     50     3.26

Task-contingent            25     3.80                  0.00
Performance-contingent     50     3.26

*Decision performance construct had the following four values: 1, 1st best choice; 2, 2nd best choice; 3, 3rd best choice; 4, 4th or worse choice.
that the performance-contingent reward group performed better than the task-contingent reward group (p = 0.00), supporting H2c. However, no significant difference is found between the no-reward and performance-contingent reward conditions; therefore, H2b6 is not supported. The lack of support for H2b may be related to the high motivation that participants naturally had due to the nature of the task. Students using a decision aid designed to help them identify preferred career choices may have been highly motivated to perform well out of self-interest in the decision outcomes supported by the decision aid. This may have muted the effects of the performance-contingent reward in increasing motivation. Rather, because of the innate motivation to perform, the effects found in H2a and H2c may be driven primarily by the expected negative effect on intrinsic motivation of providing task-contingent rewards. An undermining effect on the intrinsic motivation of individuals offered a task-contingent reward versus no reward is predictable (e.g., Deci, 1972; Eisenberger &
Cameron, 1996; Lepper et al., 1973; Ryan & Deci, 1996; Sansone & Harackiewicz, 1998).
Results for the Joint Effect of Feedback and Reward

The research question examines whether feedback interacts with reward to affect decision performance. This question focuses on the areas where Ryan et al. (1983) found differences and provides a replication of the effects in the context of a decision aid environment as the conditions relate to decision performance. Ryan et al. (1983) found that intrinsic motivation under the performance-contingent reward condition was lower relative to the no-reward condition, when feedback was present (p < 0.05). This study does not find similar significant results for our decision performance measure. However, this may be related to the perceived task interest effects on motivation as noted in the discussion of the results for H2. Ryan et al. (1983) also reported that the informational feedback/performance-contingent reward group had higher intrinsic motivation than the controlling feedback/performance-contingent reward and the no-feedback/task-contingent reward groups. The findings of the current study indicate that decision performance does not differ between the informational feedback/performance-contingent reward and controlling feedback/performance-contingent groups. This result is likely a function of the participants’ positive response to controlling feedback in a decision aid environment as discussed in the results for H1. Consistent with Ryan et al.’s finding for intrinsic motivation (p < 0.04), the results of this study indicate that the informational feedback/performance-contingent reward group marginally outperformed the no-feedback/task-contingent reward group (p = 0.07). Finally, the results indicate that the controlling feedback/performance-contingent reward group performed better than the no-feedback/task-contingent reward group (p = 0.01); this contradicts Ryan et al.’s finding of no significant difference between the conditions for their intrinsic motivation measure.
However, given the positive response of participants in this study to the controlling feedback in a decision aid environment, coupled with the positive effect theorized for performance-contingent rewards on decision performance (as opposed to the negative effect theorized for intrinsic motivation in Ryan et al.’s study), this alternative finding is actually not surprising. The results are presented in Table 3.
Table 3. Interaction between Feedback and Reward on Decision Performance.

Type of Feedback/Reward                                 n     Performance*    Our Study’s p-Value   Ryan et al.’s p-Value
                                                              (Mean)          (Performance)         (Intrinsic Motivation)

Feedback/no reward                                      50    3.28            0.92                  <0.05
Feedback/performance-contingent reward                  50    3.29

Informational feedback/performance-contingent reward    25    3.36            0.52                  <0.02
Controlling feedback/performance-contingent reward      25    3.16

Informational feedback/performance-contingent reward    25    3.36            0.07                  <0.04
No feedback/task-contingent reward                      25    3.80

No feedback/task-contingent reward                      25    3.80            0.01                  n.s.
Controlling feedback/performance-contingent reward      25    3.16

No feedback/no reward                                   26    3.38            0.05                  <0.05
No feedback/task-contingent reward                      25    3.80

*Decision performance had the following four values: 1, 1st best choice; 2, 2nd best choice; 3, 3rd best choice; 4, 4th or worse choice.
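The coding described in the note to Table 3 can be illustrated with a minimal Python sketch. It assumes that each alternative carries a utility value (for example, one derived from a participant's own utility function) and that alternatives are ranked from highest to lowest utility. The function name, the alternative labels, and the utility values are hypothetical and are not taken from the study.

```python
def performance_score(chosen, utilities):
    """Return the four-level decision performance score for a chosen alternative.

    utilities: dict mapping alternative label -> utility (higher is better).
    Scores follow the note to Table 3: 1 = best choice, 2 = 2nd best,
    3 = 3rd best, 4 = 4th best or worse.
    Assumption (not from the study): ties keep the order in which
    alternatives are listed.
    """
    if chosen not in utilities:
        raise KeyError(f"unknown alternative: {chosen!r}")
    # Rank alternatives from highest to lowest utility (the sort is stable).
    ranked = sorted(utilities, key=lambda alt: utilities[alt], reverse=True)
    rank = ranked.index(chosen) + 1
    # Ranks of 4 and beyond are pooled into the lowest category.
    return min(rank, 4)


# Hypothetical utilities for five alternatives (illustrative values only).
example_utilities = {"A": 0.62, "B": 0.81, "C": 0.47, "D": 0.30, "E": 0.12}
print(performance_score("B", example_utilities))  # highest-utility alternative
print(performance_score("E", example_utilities))  # fifth-ranked, pooled as "4th or worse"
```

Under this coding a lower score indicates a better choice, so a lower group mean in Table 3 corresponds to better decision performance.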
IMPLICATIONS OF RESEARCH
Implications for Designers
This research indicates that individuals using a decision aid with a feedback characteristic perform better than those using a decision aid without a feedback characteristic. The results also show that individuals receiving positive feedback, regardless of the informational or controlling nature of its administration, perform better than the no-feedback group. In particular, the favorable view of the participants in the controlling feedback group as to the level of positive feedback may indicate that feedback designs in human–computer interactions may be more effective in modes that differ from those preferred in human–human interactions. Overall, these results also provide evidence supporting the call by Johnson et al. (2004) for designers to incorporate positive feedback in their designs of decision aids. Positive feedback appears to lead to favorable user perceptions of a decision aid, which in turn leads to improved decision performance.
The results of this study also indicate that task-contingent rewards undermine decision performance relative to no reward, while performance-contingent rewards enhance decision performance relative to task-contingent rewards. As such, designers should be cognizant of the type of reward structures that exist in decision aid supported environments. Incorporation of appropriate reward structures in the design of computer-aided environments appears to be important and can impact user perceptions of decision aids and their decision performance.
Implications for Future Research
This research study extends the study by Ryan et al. (1983) in the context of decision aid use in task completion. Ryan et al.’s work is also extended in terms of examination of decision performance as opposed to a precursor condition to performance – intrinsic motivation. Accordingly, the results are mixed when compared to those of Ryan et al., most likely due to the alternative focus of this study. First, this study uses decision performance as the dependent measure while Ryan et al. use intrinsic motivation as their dependent measure. In a decision-aided environment, the end decision performance effect is of greater concern than the users’ contentedness with the software and task. Future research, however, may want to explore whether the text-based feedback incorporated into the decision aid interface has similar or differing effects on intrinsic motivation. This may be of particular interest in human–computer interactions that occur in environments other than decision making, such as systems designed to support learning and training. There are other differences between Ryan et al.’s (1983) study and this study that may also warrant further exploration. For instance, the participants of this study were accounting students at a non-U.S. university. This raises questions as to whether differences in national culture, professional culture, or educational background might affect users’ reactions to reward structures and feedback characteristics. Related to this, of course, is the issue of how each of these cultural and educational aspects might affect interactions with computers themselves, as well as the commingled effects of reward and feedback. This study opens up several other research possibilities as well. First, future research can manipulate user perceptions of a task (e.g., interest, importance, utility, or opportunity cost) to examine their impact on intrinsic motivation, extrinsic motivation, and decision performance.
The results of this study certainly provide some indication that task interest influences results. Second, the impact of factors such as a person’s motivational orientation, other decision environmental factors (i.e., accountability, justification, and time constraint), task characteristics (i.e., complexity, difficulty, structure, ambiguity, and novelty), and user characteristics (i.e., ability, knowledge, and experience) on intrinsic motivation in decision aid environments and subsequent decision performance is also of interest. Third, future work can investigate how intrinsic motivation interacts with perceived effectiveness, efficiency, or effort in using a decision aid to affect motivation to use the decision aid. Fourth, additional work can examine how other decision aid characteristics (such as ease of use, presentation format, system restrictiveness, decisional guidance, and interaction support) influence intrinsic motivation. Finally, researchers can help designers understand how decision aid characteristics can be manipulated to obtain favorable user perceptions that increase intrinsic motivation, motivation to use a decision aid, actual decision aid use, and decision performance (Chan, 2005).
LIMITATIONS
The standard limitations of laboratory experiments apply to this study. A major criticism of laboratory experiments is their limited external validity. In this study, the task and the decision aid were new to the participants, and while the participants may have been making decisions that were familiar to them, the multiattribute evaluation strategy may have been more structured than the decision process they would typically apply. The presence of a familiar decision aid and/or a familiar day-to-day task might alter the behavior patterns of decision makers. Additionally, while the task and decision aid were applicable to the participants in the study, professionals making highly complex decisions may exhibit different traits in the use of decision aids under the associated feedback and reward conditions.
CONCLUDING REMARKS
The common belief among the research community is that usage of human-like characteristics in designing systems interfaces enhances the human acceptance of the computer in human–computer interaction environments (e.g., Reeves & Nass, 1996; Johnson et al., 2006). The use of human-like characteristics brings with it the emotive characteristics of human communication and conveys a broader message than simply the factual words presented. The symbolic or embedded connotations within the message can influence users’ attitudes toward the system (Shirk, 1988). The research reported in this paper introduces the seminal work of Ryan et al. (1983) into the discourse over systems interface design. As such, this study examined the impact of informational and controlling feedback, as well as task-contingent and performance-contingent rewards, on the decision performance of aided users. The introduction of user utility functions to assess decision performance provides a reliable measure for assessing the effectiveness of decision-making in nondeterministic decision environments. Use of the utility functions enables examination of actual decision performance as opposed to the underlying motivation that determines effort and in turn influences performance. The results provide new insights into how feedback and rewards affect decision makers operating in a decision-aided environment and how feedback affects performance when it is driven by computer interaction. Given the prevalence of such use in professional decision-making environments such as accounting, additional research in the area is warranted. Design features of systems appear critical to establishing a comfort level with the user to help establish cognitive fit – a key antecedent to reliance on a decision aid (Arnold & Sutton, 1998). Additionally, as audit firms and other organizations weigh incentives for encouraging use of decision aids, a better understanding of the impact of incentives on acceptance and use is needed. The research reported in this study provides some additional perspectives on the influence of design and incentives on decision aid use and its impact on decision performance. Further research is greatly needed to enhance this understanding and to develop better insights into the mechanisms that are most useful in encouraging appropriate use of decision aids in accounting and other professional decision-making environments.
NOTES
1. Effort has been established in the motivation literature as a function of intrinsic and extrinsic motivation. Performance is viewed as a function of effort and ability. Hence, while ability is static for a user, increases in intrinsic or extrinsic motivation will increase effort and lead to better performance.
2. Based on this study’s interest in replicating Ryan et al.’s (1983) findings within a decision aid context, for task-contingent rewards the interest is in the no-feedback condition and for the performance-contingent reward the interest is in the informational and controlling feedback conditions.
3. The computer proficiency scale comprised proficiency in word processing, spreadsheet, DSS, database, graphics, email, and programming language. Each participant’s responses to the items on this scale were summed and averaged to produce a composite computer proficiency score.
4. The pressure/tension subscale of Ryan et al.’s (1983) Intrinsic Motivation Inventory was used to measure the extent of pressure/tension that the participants felt while using the decision aid with no feedback, informational feedback, or controlling feedback. These groups did not report significant differences in pressure/tension in doing the task.
5. Similar results were obtained when the Tukey HSD statistical method was used to compare the feedback groups and ANOVA was used to compare the presence (i.e., the combined informational and controlling feedback groups) versus absence of feedback.
6. Similar results were obtained when the Tukey HSD statistical method was used to compare the reward groups.
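The composite score described in Note 3 can be illustrated with a minimal Python sketch. The item names follow the seven areas listed in the note; the rating values, the function name, and the handling of missing items are hypothetical, and the sketch assumes every item is answered on the same numeric scale.

```python
PROFICIENCY_ITEMS = (
    "word processing", "spreadsheet", "DSS", "database",
    "graphics", "email", "programming language",
)


def composite_proficiency(responses):
    """Average a participant's item ratings into one composite score.

    responses: dict mapping each item in PROFICIENCY_ITEMS to a numeric rating.
    Assumption (not from the study): a missing item raises an error
    rather than being imputed.
    """
    missing = [item for item in PROFICIENCY_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"missing responses for: {missing}")
    # Sum the item ratings, then divide by the number of items.
    total = sum(responses[item] for item in PROFICIENCY_ITEMS)
    return total / len(PROFICIENCY_ITEMS)


# Hypothetical ratings for one participant (illustrative values only).
participant = {
    "word processing": 6, "spreadsheet": 5, "DSS": 2, "database": 3,
    "graphics": 4, "email": 7, "programming language": 1,
}
print(composite_proficiency(participant))
```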
ACKNOWLEDGMENTS
The authors thank the University of Massachusetts at Boston and Nanyang Technological University, Singapore, for funding their research study. They are grateful to Dr. William L. Moore of the University of Utah for his help with the conjoint analysis for their decision performance measure. The authors also thank Mr. Jiqing Gao for coding their program.
REFERENCES
Amabile, T. M. (1983). The social psychology of creativity. New York: Springer-Verlag.
Amer, T. (1991). An experimental investigation of multi-cue financial information display and decision making. Journal of Information Systems, 5, 18–34.
Arnold, V., Clark, N., Collier, P. A., Leech, S. A., & Sutton, S. G. (2006). The differential use and effect of knowledge-based system explanations in novice and expert judgment decisions. MIS Quarterly, 30(1), 79–97.
Arnold, V., Collier, P. A., Leech, S. A., & Sutton, S. G. (2004). The impact of intelligent decision aids on experienced and novice decision makers’ judgments. Accounting and Finance, March, 1–26.
Arnold, V., & Sutton, S. G. (1998). The theory of technology dominance: Understanding the impact of intelligent decision aids on decision makers’ judgments. Advances in Accounting Behavioral Research, 1, 175–194.
Awasthi, V., & Pratt, J. (1990). The effects of monetary incentives on effort and decision performance: The role of cognitive characteristics. Accounting Review, 65(4), 797–811.
Baker, H. E. (1999). Attachment style and career development among college-aged adults. Doctoral dissertation, University of San Francisco.
Boggiano, A. K., & Ruble, D. N. (1979). Competence and the overjustification effect: A developmental study. Journal of Personality and Social Psychology, 37, 1462–1468.
Calder, B. J., & Staw, B. M. (1975). Self-perception of intrinsic and extrinsic motivation. Journal of Personality and Social Psychology, 31, 599–605.
Cameron, J., & Pierce, W. D. (1994). Reinforcement, reward, and intrinsic motivation: A meta-analysis. Review of Educational Research, 64, 363–423.
Carroll, J. D. (1969). Categorical conjoint measurement. Paper presented at Meeting of Mathematical Psychology, Ann Arbor, MI (August).
Carroll, J. D., & Green, P. E. (1995). Psychometric methods in marketing research: Part I, conjoint analysis. Journal of Marketing Research, 2(4), 385–391.
Chan, S. H. (2005). A motivational framework for understanding IS use and decision performance. Review of Business Information Systems, 9(4), 101–117.
Chu, P. C., & Spires, E. E. (2001). Does time constraint on users negate the efficacy of decision support systems? Organizational Behavior and Human Decision Processes, 85(2), 226–249.
Condry, J. (1977). Enemies of exploration: Self-initiated versus other-initiated learning. Journal of Personality and Social Psychology, 35(7), 459–477.
Daniel, T. L., & Esser, J. K. (1980). Intrinsic motivation as influenced by rewards, task interest, and task structure. Journal of Applied Psychology, 65(5), 566–573.
Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3), 319–339.
Deci, E. L. (1971). Effects of externally mediated rewards on intrinsic motivation. Journal of Personality and Social Psychology, 18, 105–115.
Deci, E. L. (1972). Intrinsic motivation, extrinsic reinforcement, and inequity. Journal of Personality and Social Psychology, 22, 113–120.
Deci, E. L. (1992). The relation of interest to the motivation of behavior: A self-determination theory perspective. In: A. K. Renninger & S. Hidi (Eds), The role of interest in learning and development (pp. 43–70). Hillsdale, NJ: Erlbaum.
Deci, E. L., Koestner, R., & Ryan, R. M. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125(6), 627–668.
Deci, E. L., & Ryan, R. M. (1985). The general causality orientations scale: Self-determination in personality. Journal of Research in Personality, 19, 109–134.
Deci, E. L., & Ryan, R. M. (1987). The support of autonomy and the control of behavior. Journal of Personality and Social Psychology, 53(6), 1024–1037.
Dowling, C. (2009). Appropriate audit support system use: The influence of auditor, audit team and firm factors. The Accounting Review, 84(3), forthcoming.
Dowling, C., & Leech, S. A. (2007). Audit support systems and decision aids: Current practice and opportunities for future research. International Journal of Accounting Information Systems, 8(2), 92–116.
Dweck, C. S. (1986). Motivational processes affecting learning. American Psychologist, 41, 1040–1048.
Eining, M. M., & Dorr, P. B. (1991). The impact of expert system usage on experiential learning in an auditing setting. Journal of Information Systems, 5, 1–16.
Eisenberger, R., & Cameron, J. (1996). Detrimental effects of reward: Reality or myth? American Psychologist, 51(11), 1153–1166.
Eisenberger, R., Pierce, W. D., & Cameron, J. (1999). Effects of reward on intrinsic motivation – negative, neutral, and positive: Comment on Deci, Koestner, and Ryan (1999). Psychological Bulletin, 125, 677–691.
Eisentein, N. (1985). Effects of contractual, endogenous, or unexpected rewards on high and low interest preschoolers. The Psychological Record, 35, 29–39.
Fogg, B. J., & Nass, C. (1997). Silicon sycophants: The effects of computers that flatter. International Journal of Human–Computer Studies, 46, 551–561.
Gibson, D. L. (1994). The effects of screen layout and feedback type on productivity and satisfaction of occasional users. Journal of Information Systems, 8(2), 105–114.
Gonzalez, C., & Kasper, G. M. (1997). Animation in user interfaces designed for decision support systems: The effects of image abstraction, transition, and interactivity on decision quality. Decision Sciences, 28(4), 793–823.
Graham, M., Kennedy, J., & Benyon, D. (2000). Towards a methodology for developing visualizations. International Journal of Human–Computer Studies, 53, 789–807.
Green, P. E., Krieger, A. M., & Wind, Y. (2001). Thirty years of conjoint analysis: Reflections and prospects. Interface, 31, S56–S73.
Green, P. E., & Srinivasan, V. (1978). Conjoint analysis in consumer research: Issues and outlook. Journal of Consumer Research, 5, 103–123.
Green, P. E., & Srinivasan, V. (1990). Conjoint analysis in marketing: New development with implications for research and practice. Journal of Marketing, 54, 3–19.
Griffith, K. M. (1984). The effects of group versus individual context initial interest and reward on intrinsic motivation. Doctoral dissertation, University of South Florida.
Harackiewicz, J. M. (1979). The effects of reward contingency and performance feedback on intrinsic motivation. Journal of Personality and Social Psychology, 37, 1352–1363.
Harackiewicz, J. M., Abrahams, S., & Wageman, R. (1987). Performance evaluation and intrinsic motivation: The effects of evaluative focus, rewards, and achievement orientation. Journal of Personality and Social Psychology, 53, 1015–1023.
Harackiewicz, J. M., & Manderlink, G. (1984). A process analysis of the effects of performance contingent rewards on intrinsic motivation. Journal of Experimental Social Psychology, 20, 531–551.
Harackiewicz, J. M., Manderlink, G., & Sansone, C. (1984). Rewarding pinball wizardry: The effects of evaluation on intrinsic interest. Journal of Personality and Social Psychology, 47, 287–300.
Harackiewicz, J. M., & Sansone, C. (2000). Rewarding competence: The importance of goals in the study of intrinsic motivation. In: C. Sansone & J. M. Harackiewicz (Eds), Intrinsic and extrinsic motivation: The search for optimal motivation and performance (pp. 79–103). San Diego: Academic Press.
Hunton, J. E. (1996). Involving information system users in defining system requirements: The influence of procedural justice perceptions on user attitudes and performance. Decision Sciences, 27(4), 647–671.
Hunton, J. E., & McEwen, R. A. (1997). An assessment of the relation between analysts’ earnings forecast accuracy, motivational incentives and cognitive information search strategy. Accounting Review, 72(4), 497–515.
Johnson, D., Gardner, J., & Wiles, J. (2004). Experience as a moderator of the media equation: The impact of flattery and praise. International Journal of Human–Computer Studies, 61(3), 237–258.
Johnson, D., & Wiles, J. (2004). Effective affective user interface design in games. Affective human factors design. Singapore: ASEAN; New York: Academic Press.
Johnson, R. D., Marakas, G. M., & Palmer, J. W. (2006). Differential social attributions toward computing technology: An empirical investigation. International Journal of Human–Computer Studies, 64(5), 446–460.
Kapes, J. T., Whitfield, E. A., & Mastie, M. M. (1994). A counselor’s guide to career assessment instruments. Columbus, OH: National Career Development Association.
Kettanurak, V., Ramamurthy, K., & Haseman, W. D. (2001). User attitude as a mediator of learning performance improvement in an interactive multimedia environment: An empirical investigation of the degree of interactivity and learning styles. International Journal of Human–Computer Studies, 54, 541–583.
Kinsella, S. (1998). A cross-discipline study of traditional and nontraditional college students. College Student Journal, 32(4), 532–538.
Klein, B. D., Goodhue, D. L., & Davis, G. B. (1997). Can humans detect errors in data? Impact of base rates, incentives, and goals. MIS Quarterly, 21(2), 169–194.
Kottemann, J. E., Davis, F. D., & Remus, W. E. (1994). Computer-assisted decision making: Performance, beliefs, and the illusion of control. Organizational Behavior and Human Decision Processes, 57, 26–37.
Kruskal, J. B. (1965). Analysis of factorial experiments by estimating monotone transformations of the data. Journal of the Royal Statistical Society, Series B, 27, 251–263.
Lepper, M. R. (1981). Intrinsic and extrinsic motivation in children: Detrimental effects of superfluous social controls. In: W. A. Collins (Ed.), Aspects of the development of competence: The Minnesota symposium on child psychology (Vol. 14, pp. 155–214). Hillsdale, NJ: Erlbaum.
Lepper, M. R., Greene, D., & Nisbett, R. E. (1973). Undermining children’s intrinsic interest with extrinsic reward: A test of the “overjustification” hypothesis. Journal of Personality and Social Psychology, 28(1), 129–137.
Lohse, G. L., & Johnson, E. J. (1996). A comparison of two process tracing methods for choice tasks. Organizational Behavior and Human Decision Processes, 68(1), 28–43.
Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement. Journal of Mathematical Psychology, 1, 1–27.
Newman, J., & Layton, B. D. (1984). Overjustification: A self-perception perspective. Personality and Social Psychology Bulletin, 10, 419–425.
Nicholls, J. G. (1984). Achievement motivation: Conceptions of ability, subjective experience, task choice, and performance. Psychological Review, 91, 328–346.
Pace, S. (2004). A grounded theory of the flow experiences of Web users. International Journal of Human–Computer Studies, 60, 327–363.
Paquette, L., & Kida, T. (1988). The effect of decision strategy and task complexity on decision performance. Organizational Behavior and Human Decision Processes, 41, 128–142.
Payne, J. W. (1976). Task complexity and contingent processing in decision making: An information search and protocol analysis. Organizational Behavior and Human Decision Processes, 16, 366–387.
Reeves, B., & Nass, C. (1996). The media equation: How people treat computers, television and new media like real people and places. Stanford, CA: Cambridge University Press.
Rose, J. M. (2002). Behavioral decision aid research: Decision aid use and effects. In: V. Arnold & S. G. Sutton (Eds), Researching accounting as an information systems discipline (pp. 111–134). Sarasota, FL: American Accounting Association.
Ross, M. (1975). Salience of reward and intrinsic motivation. Journal of Personality and Social Psychology, 32, 245–254.
Ryan, R. M. (1982). Control and information in the intrapersonal sphere: An extension of cognitive evaluation theory. Journal of Personality and Social Psychology, 43, 450–461.
Ryan, R. M., & Deci, E. L. (1996). When paradigms clash: Comments on Cameron and Pierce’s claim that rewards do not undermine intrinsic motivation. Review of Educational Research, 66, 33–38.
Ryan, R. M., & Deci, E. L. (2000). When rewards compete with nature: The undermining of intrinsic motivation and self-regulation. In: C. Sansone & J. M. Harackiewicz (Eds), Intrinsic and extrinsic motivation: The search for optimal motivation and performance (pp. 257–307). San Diego: Academic Press.
Ryan, R. M., Mims, V., & Koestner, R. (1983). Relation of reward contingency and interpersonal context to intrinsic motivation: A review and test using cognitive evaluation theory. Journal of Personality and Social Psychology, 45, 736–750.
Sansone, C., & Harackiewicz, J. M. (1998). “Reality” is complicated: Comment on Eisenberger & Cameron. American Psychologist, 53, 673–674.
Shalley, C. E., & Perry-Smith, J. E. (2001). Effects of social–psychological factors on creative performance: The role of informational and controlling expected evaluation and modeling experience. Organizational Behavior and Human Decision Processes, 84(1), 1–22.
Shirk, H. N. (1988). Technical writers as computer scientists: The challenges of online documentation. In: E. Barrett (Ed.), Text, context, and hypertext: Writing with and for the computer. Cambridge, MA: MIT Press.
Smith, A. T. (1980). Effects of symbolic reward and positive feedback on high and low levels of intrinsic motivation in preschoolers. Doctoral dissertation, University of Missouri – Columbia.
Stone, D. N. (1995). The joint effects of DSS feedback and users’ expectations on decision processes and performance. Journal of Information Systems, 9(1), 23–41.
Sugar, W. A. (2001). What is so good about user-centered design? Documenting the effect of usability sessions on novice software designers. Journal of Research on Computing in Education, 33(3), 235–251.
Svenson, O. (1979). Process descriptions of decision making. Organizational Behavior and Human Decision Processes, 23(1), 86–112.
Tang, S.-H., & Hall, V. C. (1995). The overjustification effect: A meta-analysis. Applied Cognitive Psychology, 9, 365–404.
Todd, P., & Benbasat, I. (1994a). The influence of decision aids on choice strategies. Organizational Behavior and Human Decision Processes, 60, 36–74.
Todd, P., & Benbasat, I. (1994b). The influence of decision aids on choice strategies under conditions of high cognitive load. IEEE Transactions on Systems, Man, and Cybernetics, 24(4), 537–547.
Tuttle, B., & Stocks, M. H. (1997). The effects of task information and outcome feedback on individuals’ insight into their decision models. Decision Sciences, 28(2), 421–442.
Tuttle, B., & Stocks, M. H. (1998). The use of outcome feedback and task property information by subjects with accounting-domain knowledge to predict financial distress. Behavioral Research in Accounting, 10, 76–108.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79(4), 281–299.
Tzeng, J. (2004). Toward a more civilized design: Studying the effects of computers that apologize. International Journal of Human–Computer Studies, 61(3), 319–345.
Wilson, R. L. (1978). The effect of reward on intrinsic motivation: An integration of dissonance and intrinsic motivation studies. Doctoral dissertation, North Carolina State University.
Young, F. W. (1972). A model for polynomial conjoint analysis algorithms. In: R. N. Shepard, A. K. Romney & S. Nerlove (Eds), Multidimensional scaling: Theory and applications in the behavioral sciences (Vol. 1, pp. 69–104). New York: Academic Press.
A LONGITUDINAL STUDY OF NEW STAFF AUDITORS’ INITIAL EXPECTATIONS, EXPERIENCES, AND SUBSEQUENT JOB PERCEPTIONS
Heather M. Hermanson, Mary C. Hill and Susan H. Ivancevich
ABSTRACT
Prior research has found that staff accountants may be disappointed when their initial work expectations do not match their early work experiences and this disappointment can lead to negative job outcomes (AAA, 1993; Dean, Ferris, & Konstans, 1988; Carcello, Copeland, Hermanson, & Turner, 1991; Padgett, Gjerde, Hughes, & Born, 2005). This paper reports information obtained from the staff auditors about their initial expectations on a variety of work factors, early work experiences related to those factors, and subsequent perceptions of the factors. Similar to prior research, the results show the new accountants had high initial expectations about the public accounting work environment and that their subsequent job perceptions were lower than their initial expectations. Explanations for the declines were not obvious, as many of the changes in perceptions were not significantly related to relevant work experiences. Given the decrease in job perceptions over time on a variety of factors, the results indicate that a gap exists between the initial work expectations of the new accountants and the work environment that they encounter during their first 18 months of employment. This gap is important because prior research indicates that when employees have unmet expectations they have less positive job attitudes and behaviors (Padgett et al., 2005; Dean et al., 1988). Further, this gap exists in spite of firms’ efforts to increase communication with students via web sites, internships, and visits to college campuses, and efforts to improve the work environment (e.g., flexible work schedules, compressed workweeks, telecommuting, etc.).
Advances in Accounting Behavioral Research, Volume 12, 145–183. Copyright © 2009 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012009
INTRODUCTION
Accountants just beginning their careers may have overly optimistic expectations of the professional accounting workplace. Prior research indicates that staff accountants’ perceptions of their work environment typically decline over the early years of work experience (AAA, 1993; Dean, Ferris, & Konstans, 1988; Carcello, Copeland, Hermanson, & Turner, 1991). These negative changes in perception can be related to a variety of factors, including levels of overtime, pay raises, client assignments, travel time, or the nature of their supervision (Marxen, 1996; Hiltebeitel, Leauby, & Larkin, 2000). Declines in perceptions about the workplace due to unmet expectations have been shown to lead to lower job satisfaction and increased turnover (Padgett, Gjerde, Hughes, & Born, 2005; Dean et al., 1988). This research reports results from a longitudinal study that tracked new staff accountants’ experiences and perceptions over their first 18 months of employment with a large accounting firm. Data were gathered when the employees began work and then at three 6-month intervals thereafter. The initial instrument gathered demographic data and the participants’ initial expectations of the work environment on a variety of factors such as expected challenge, expected enjoyment of the work, expected flextime opportunities, and expected work hours. The subsequent instruments gathered experience data such as hours worked, hours of training received, number and variety of clients, raises, and promotions. The subsequent instruments also gathered data about the new staff accountants’ job perceptions using a variety of questions regarding the same factors addressed on the initial instrument.
Thus, the data revealed “before” opinions and “after” opinions based on actual work experiences. This research is modeled after and extends the previous studies (e.g., Dean et al., 1988; Carcello et al., 1991; Hiltebeitel et al., 2000) that examine whether new accountants’ opinions on a variety of factors decline over time. This study follows the previous research (e.g., Carcello et al., 1991) to examine whether accountants’ perceptions are still trending downward in spite of the more employee-friendly policies accounting firms have implemented (e.g., flextime, compressed workweeks, telecommuting, etc.). It extends the previous studies by capturing data related to actual work experiences in the same period in which the perception data were collected. Finally, this study extends prior research by using a new generation of workers that may have different initial expectations of the workplace than those workers who participated in prior research. Turnover has been an acknowledged problem for public accounting firms from as early as 1971 (Trump & Hendrickson, 1971), yet it persists in the accounting workplace (Hiltebeitel et al., 2000; Pasewark & Viator, 2006). When employee expectations are not met, employees have more negative job attitudes and behaviors, including higher turnover (Padgett et al., 2005; Dean et al., 1988). Thus, this paper should be of interest to practitioners who desire an understanding of changes in entry-level accountants’ perceptions in the workplace. The descriptive data on the new accountants’ experiences (such as hours worked, commute times, training hours) and their effect on perceptions can inform firms about which experiences are trouble spots that decrease new staff accountants’ perceptions. Firms can use these data to decide whether to address these trouble spots with new policies or motivational efforts or at least be aware of experiences that may be problematic to their efforts to enhance retention.
Prior research has found a disparity between perceptions of accounting professors and students with respect to the experiences of new staff accountants (DeZoort, Lord, & Cargile, 1997). Thus, this study should be of interest to professors as it provides data that they can use to shape the expectations of students with respect to positions in public accounting. Research has also reported differences between what students believe and what new accountants believe about the experiences of new staff accountants (Carcello et al., 1991). Therefore, the data may help new accountants be better prepared to enter the workforce. The next section presents a literature review and research questions and is followed by data collection, data analyses, limitations, and discussion of the results.
HEATHER M. HERMANSON ET AL.
LITERATURE REVIEW AND RESEARCH QUESTIONS
Public accountant job satisfaction and turnover have been the subject of extensive previous research and are naturally interrelated (e.g., job satisfaction affects turnover). Multiple factors such as an employee’s personal characteristics, employer characteristics, and the nature of the work itself impact satisfaction and turnover and increase the difficulty of understanding the relationships between the various employee, employer, and work factors. Employees’ initial expectations about factors in their work environment and their subsequent evaluation of the factors in that environment (e.g., expectation about overtime hours versus experienced overtime hours) will also affect job satisfaction and turnover (Padgett et al., 2005; Dean et al., 1988). This study examines research questions on changes in perceptions on a variety of work factors that prior research has found to influence job satisfaction and turnover.
Prior research has found that when the work environment meets an accountant’s expectations, the accountant has more positive job perceptions and remains with the firm longer (Padgett et al., 2005; Dean et al., 1988). This study follows previous literature that has examined changes in accountants’ perceptions about their work environment (Dean et al., 1988; Reed & Kratchman, 1989; Carcello et al., 1991; Marxen, 1996; Hiltebeitel et al., 2000; Padgett et al., 2005). These studies have consistently found, using a variety of participants, factors, and research methods, that employees’ opinions of the work environment decline over time. This decline has been labeled “occupational reality shock” (Dean et al., 1988). Padgett et al. (2005) suggest that employees with little prior work experience will be more likely to experience occupational reality shock. This study extends the prior research in two ways. First, the prior research on changes in job perceptions has provided only opinion data.
This study adds descriptive data on the work experiences of the new staff preceding the measures of the job perceptions. Second, this study adds more recent data about changes in accountants’ perceptions. New data are important as the characteristics of the population change because research has found generational differences in work expectations. For example, baby boomers, who were born between 1946 and 1964, generally expected their work environment to provide personal growth and financial rewards (Hazard, 2007). Generation X workers, who were born between 1965 and 1975, expect employers to help them develop skills and meet their personal needs (Hazard, 2007). Finally, Generation Y workers, who were born in 1976 or later1 (Hazard, 2007), grew up with computer technology, video games, and
multitasking (Bell & Narz, 2007; Walmsley, 2007). Therefore, they typically display shorter attention spans and expect more workplace challenges. Given their comfort with technology, Generation Y workers expect flexibility and telecommuting options (Bell & Narz, 2007; Walmsley, 2007). The participants in the studies mentioned previously were of earlier generational cohorts, while this study’s participants were primarily (83%) Generation Y workers. Thus, based upon their generation’s noted characteristics, the participants in this study may have expectations of the workplace that are inherently different from the previously studied groups (Walmsley, 2007).
Along with changes in perceptions, prior researchers have addressed multiple other factors that affect job satisfaction and turnover.2 These factors include personal characteristics, workplace characteristics, the work itself, and economic alternatives. Some of the personal characteristics that have been found to affect job satisfaction are educational level, career intentions, gender, age (Harrell, 1990; Moyes, Williams, & Koch, 2006), A/B personality type (Rasch & Harrell, 1990; Fisher, 2001), marital status, parenthood, and work versus family balance (Bernardi & Hooks, 2001; Dalton, Hill, & Ramsay, 1997). Workplace characteristics include the office size (Gaertner & Ruhe, 1981), structure of the audit firm and audit methods (Bamber, Snowball, & Tubbs, 1989; Dalton et al., 1997), the offering of flexible work hours (Almer & Kaplan, 2002; Padgett et al., 2005), coworkers, supervision and mentoring programs (Patten, 1995; Kaplan, Keinath, & Walo, 2001; Padgett et al., 2005), and compensation (Sweeney & McFarlin, 2005). The work itself includes the amount of challenge, variety, and number of hours worked (Sweeney & Summers, 2002).
All of these factors interact with each other; for example, amount of overtime worked may have a more negative impact on job satisfaction as an employee’s personal responsibilities increase. This study examines changes in new staff accountant perceptions on workplace characteristics and the work itself. The examination is limited to these factors because they are to some extent controllable by the accounting firm (e.g., offering flexible work hours, providing mentors, limiting the number of hours worked).
Work–Life Balance
The first factor examined in this study is work and family life balance. Individuals are required to divide their time between work and nonwork, or
work and leisure. This necessary allocation is often labeled work–life balance, and a work versus family life conflict is presumed to exist when the work and nonwork situations are incompatible in some respect (Padgett et al., 2005). One form of this work versus personal life conflict results from the number of hours worked (Gaertner & Ruhe, 1981; Hiltebeitel et al., 2000; Dalton et al., 1997). Public accounting is a service industry with an environment that often includes inflexible deadlines and constrained work budgets, which can frequently require accounting professionals to work overtime (Sweeney & Summers, 2002). Given this environment, accountants’ work versus family life, or work versus leisure time, conflict is generally high (Padgett et al., 2005). This conflict may be especially high for Generation Y workers who want to have a good personal life and are less likely to tolerate long hours than previous generations (Hazard, 2007; Walmsley, 2007). Other factors that affect work versus family life balance are where the work is performed (e.g., client site in town, client site out of town, firm office, employee home) and when the work is performed (i.e., traditional work hours versus flexible work hours) (Padgett et al., 2005; Almer & Kaplan, 2002). Flexible work arrangements such as alternative schedules (e.g., four 10-hour days per week), reduced schedules, and work at home options are found to be important to the accountants’ job satisfaction (Almer & Kaplan, 2002). For Generation Y, flexible work arrangements may even be more important than in previous generations, because Generation Y workers are so technologically adept that telecommuting may be expected (Trunk, 2007; Bell & Narz, 2007; Gerdes & Broden, 2007). This study addresses work versus family life balance for new staff accountants with these research questions:
RQ1A. Do the regular hours and busy season hours worked by new staff accountants meet or exceed the regular hours and busy season hours they expected to work when they were first hired?
RQ1B. Do the travel requirements of new staff accountants meet or exceed the travel requirements they expected when they were first hired?
RQ1C. Do the new staff accountants’ perceptions of flextime options meet or exceed the flextime options they expected when they were first hired?
RQ1D. Do the new staff accountants’ perceptions of work at home options meet or exceed the work at home options they expected when they were first hired?
RQ1E. Do the new staff accountants’ perceptions of their ability to balance work requirements and personal activities meet or exceed the work–life balance they expected when they were first hired?
Training
The next research question addresses new accountant training. Historically, students have cited training as one of the reasons for selecting public accounting for their initial accounting work experience (Peterson & Devlin, 1998). Marxen (1996) cites training as one of the factors that accountants liked most about public accounting. However, Hermanson, Hill, and Ivancevich (2002), using newer data, report training ranked third from the bottom of 13 possible reasons for selecting an accounting firm. Regardless of whether training is an important reason for selecting a firm, Carcello et al. (1991) report that new staff accountants’ opinions on the training they received were less positive than the expectations regarding training expressed by students. Training may be an even bigger issue for Generation Y employees because they want and expect challenge and professional development in the workplace (Bell & Narz, 2007). This research addresses the following question:
RQ2. Does the perception of the adequacy of training received by new staff accountants meet or exceed the extent of training they expected when they were first hired?
Challenge and Variety
Other factors that affect job satisfaction are the variety and challenge of the work being performed (Hiltebeitel et al., 2000). Marxen (1996) studied alumni of public accounting firms. Of that study’s participants, 44% reported variety and 29% reported challenge as attributes they liked about public accounting. Marxen further reported that many accountants selected public accounting to obtain experience working with a variety of companies in a variety of industries.
Carcello et al. (1991) reported their participants most frequently gave diversity of job assignments as the reason for liking public accounting. Hiltebeitel et al. (2000) surveyed new accountants on 23 factors related to job satisfaction and found the new accountants were most positive about their work giving them the ability to take charge and use analytical and critical thinking skills. Variety and challenge may also be important for the participants in this study as Generation Y workers are noted for wanting autonomy, challenge, and variety in their work activities (Trunk, 2007; Bell & Narz, 2007; Sridharan, 2007). Thus, the next research questions address these issues.
RQ3A. Do the new staff accountants’ perceptions of the challenge of their work meet or exceed the level of challenge they expected to encounter when they were first hired?
RQ3B. Do the new staff accountants’ perceptions of the variety of their work meet or exceed the work variety they expected when they were first hired?
Worker Relationships
Coworkers, both peers and supervisors, can significantly affect job satisfaction. This factor may be particularly true for Generation Y workers as friendship is a strong motivator for them to join or remain with a firm (Trunk, 2007; Gerdes & Broden, 2007). Previous researchers have related supervisory actions, such as mentoring, to job satisfaction (Patten, 1995; Stallworth, 2003). Mentors have also been related to compensation and promotion (Padgett et al., 2005). However, some research suggests that lower-level accountants are less likely to have a mentor than more experienced accountants (Scandura & Viator, 1994; Viator, 1999). Hiltebeitel et al. (2000) found that supervision was the job-related factor about which new accountants’ opinions were most likely to decline as a consequence of work experience; the five most significant declines in perceptions out of 23 job-related questions pertained to supervisory actions. Mentoring and supervision may be particularly important to Generation Y employees, as they tend to seek more feedback on their performance (Sridharan, 2007; Hazard, 2007; Gerdes & Broden, 2007). The next research questions, then, address relationships between the new staff accountants and their coworkers and superiors.
RQ4A. Do the new staff accountants’ perceptions of the friendliness of their coworkers meet or exceed the level of friendliness they expected to find in their coworkers when they were first hired?
RQ4B. Do the new staff accountants’ perceptions of the adequacy of supervisory mentoring received meet or exceed the expectations they had when they were first hired of the mentoring that would be provided?
Raises and Promotions
Raises and promotions are also significantly related to job satisfaction (Sweeney & McFarlin, 2005). Hermanson et al. (2002) found that new accountants rated salary and job advancement opportunities second only to firm culture and reputation as reasons for selecting their job. Peterson and Devlin (1998) reported that graduating seniors in the United States cited “opportunity for advancement” as the number one criterion used for selecting their first job. The market for accounting new hires was at the time of this study, and still is, extremely competitive for employers (Schroeder & Reichardt, 2003). This marketplace led to rising salaries and even signing bonuses for new accountants (Gerdes & Broden, 2007; Robert Half International, 2007). Thus, the expectations with respect to promotions and salaries currently may be very high and the new accountants may perceive they have multiple alternative job opportunities. Research has found that pay comparison (either internal to an organization or external to an organization) can affect job satisfaction (Sweeney & McFarlin, 2005). A final pressure on expectations regarding raises and promotions is that Generation Y individuals have grown up at a time when incentives were embedded into their culture (Sridharan, 2007). This study addresses these research questions on raises and promotions:
RQ5A. Do the new staff accountants’ perceptions of the raises they received meet or exceed the expectations they had when they were hired of the raises they would receive?
RQ5B. Do the new staff accountants’ perceptions of the promotions they received meet or exceed their initial expectations of the promotions they would receive?
DATA COLLECTION, ANALYSIS, AND RESULTS
Sample
One international accounting firm agreed to sponsor this project by permitting the authors to contact staff and invite them to participate. The firm provided a list of new staff accountants who started work in July, August, September, or October of 2000 in the United States of America. From this list, 70% of the new accountants were randomly selected, resulting in a sample of 325 participants.3 The sample included respondents from all geographic regions of the country. The participants completed four survey instruments (over 18 months) that were mailed directly to their home addresses as supplied by the accounting firm. The first instrument collected data on initial recruiting, job expectations, and demographics (the original instrument was eight pages long) and was mailed to all participants in November of the year of employment.4 Three subsequent instruments, mailed at 6-month intervals thereafter, collected data on changes in demographic status (e.g., had they moved or had a child?), detailed information about their job experiences for the last 6-month period, their opinions about a variety of work factors, and their current job satisfaction (these original instruments were seven pages long). Participants were asked to use a tracking code based on the last five digits of their social security number so that instruments from the same participant could be matched and compared from survey to survey. Participants were sent one follow-up letter if they had not responded to an instrument after approximately 2 months. Participants only received a subsequent instrument if they completed the immediately preceding instrument. All data were self-reported, although the participants were asked to consult their time sheets for work activity data (e.g., client data and hours worked).
Table 1, Panel A presents a summary of the data collection effort and response rates.5 The response rates are high for a mail survey, particularly a mail survey that used such lengthy questionnaires (Alreck & Settle, 1985). The instruments, adapted and shortened to contain only the survey questions relevant to this study, are shown in Appendices A and B.
Demographic Description of Sample
Table 1, Panel B provides the demographic information supplied by the participants. Demographic data were provided on the initial instrument.
Table 1. Participants.

                               Initial     First 6-Month   Second 6-Month   Third 6-Month
                               Survey      Follow-Up       Follow-Up        Follow-Up
Panel A: Data collection
  Surveys mailed(a)            325         185             99               47
  Returned by post office      1           4               3                2
  Responses                    185         99              47               32
  Response rate                57.10%      54.70%          48.96%           71.11%

Panel B: Demographic statistics
  Responses                    185         99              47               32
  Continuous variables
    Average age                23.72       24.16           24.28            24.74
    Average GPA                3.50        3.51            3.55             3.58
    Average SAT                1198        1185            1220             1231
  Categorical variables (%)
    Female                     57.30       62.63           68.09            62.50
    White                      79.89       79.59           80.85            84.38
    Single                     84.86       77.78           76.60            81.25
    Accounting majors          78.38       81.82           85.11            84.38
    Intern                     67.39       60.61           70.21            68.75
    With no work experience    70.27       66.67           65.96            71.88
    Homeowner                  8.15        10.10           14.89            15.63

(a) After the initial mailing, a subsequent instrument was only mailed to those participants who responded to the immediately preceding survey.
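The response rates in Table 1, Panel A appear consistent with dividing the responses by the number of surveys delivered (surveys mailed less those returned by the post office). The chapter does not state this formula, so the short Python sketch below is an inference from the reported figures; the function name `response_rate` is illustrative only.

```python
def response_rate(mailed: int, returned_by_post_office: int, responses: int) -> float:
    """Responses as a percentage of surveys assumed to have been delivered."""
    delivered = mailed - returned_by_post_office  # assumption: undeliverable surveys are excluded
    return 100.0 * responses / delivered


# Counts taken from Table 1, Panel A: (label, mailed, returned, responses).
waves = [
    ("Initial survey", 325, 1, 185),
    ("First 6-month follow-up", 185, 4, 99),
    ("Second 6-month follow-up", 99, 3, 47),
    ("Third 6-month follow-up", 47, 2, 32),
]

for label, mailed, returned, responses in waves:
    print(f"{label}: {response_rate(mailed, returned, responses):.2f}%")
```

Under this assumption the four computed rates agree with the percentages reported in the table.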
The demographic data obtained in the initial instrument indicate that the participants are young, academically skilled, accounting majors with little work experience. A large majority (83%) of the participants were born in 1976 or after, making them Generation Y workers. They are mostly single (84.86%), slightly more than half are female (57.30%), and are living either in an apartment (52%) or at home with their parents (35%). The majority of the respondents participated in an internship program (67%). The table also presents these demographic factors at each subsequent measurement point for the participants who responded to the later instruments. These data were used to examine whether the respondents were withdrawing from the study in a systematic pattern. The continuous demographic variables were tested using t-tests for differences in the participants who completed each subsequent instrument. These demographic
characteristics of individuals in the sample at each measurement date were not significantly different from the demographic characteristics of individuals in the sample at other measurement dates. The results were similar for the categorical demographic variables, which were evaluated using chi-square tests. Thus, respondents did not withdraw from the study based on a demographic factor such as age, intelligence, or gender.
Data Analysis
The data analysis for each job satisfaction factor consists of three parts. The first part is an aggregation of descriptive data about the actual work experiences of the participants that would naturally be related to the factor. For example, the experience of the hours worked would relate to the ability to balance work and family life, the number of training hours attended would relate to training satisfaction, the number of clients would relate to the opinion on the variety in the work, and so forth. The second part consists of statistical comparisons of the changes within a participant over time; that is, a paired comparison looking at the participants’ before and after observations. Each test is conducted comparing the individual participant’s opinion at the later survey time to his/her expectation at initial hiring, that is, 6 months later versus initial expectation, 12 months later versus initial expectation, and 18 months later versus initial expectation. The changes of each individual participant are measured and only then are these differences grouped for statistical analysis. The mean value of the initial expectations differs in each period of the analysis because the mean only includes the observations from those participants who completed a later instrument. Given the results from prior research that show consistent declines in opinions (Dean et al., 1988; Carcello et al., 1991), one-way comparisons are conducted expecting lower opinions held at the later time than at initial hiring. The final part of the analysis uses the descriptive work experience data (e.g., hours worked, training hours, commute times, amount of raise) relevant to a particular opinion to determine if the work experience was related to the opinion.
Thus, the opinion on training is regressed against training hours, the opinion on salary is regressed against the actual increase in salary, and the opinion on work–life balance is regressed against work hours and travel time. Separate models were run containing the data from each of the three follow-up instruments.
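The paired comparison described above can be illustrated with a minimal Python sketch. It assumes a standard paired t-statistic computed on within-participant differences (initial expectation minus later perception); the chapter does not name the exact test or software used, the function name `paired_t_statistic` is hypothetical, and the responses below are made-up values, not study data.

```python
import math
import statistics


def paired_t_statistic(initial, later):
    """Return (mean difference, t statistic, degrees of freedom) for paired data.

    Differences are taken within each participant as initial expectation minus
    later perception, so a positive mean difference indicates a decline.
    """
    if len(initial) != len(later):
        raise ValueError("each participant needs both an initial and a later response")
    diffs = [before - after for before, after in zip(initial, later)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean_diff, mean_diff / std_error, n - 1


# Hypothetical 7-point responses for five matched participants (not study data).
initial_expectation = [6, 5, 6, 7, 5]
later_perception = [5, 4, 6, 5, 4]

mean_diff, t_stat, dof = paired_t_statistic(initial_expectation, later_perception)
print(f"mean decline = {mean_diff:.2f}, t = {t_stat:.2f}, df = {dof}")
```

Because the study expects declines, the resulting statistic would be referred to a single tail of the t distribution with the stated degrees of freedom; that lookup is left to a statistics package.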
Work–Life Balance (RQ1A–RQ1E)
The first set of research questions addresses activities and perceptions related to work versus family life balance.6 Data related to these questions are presented in Table 2. Table 2, Panel A presents data on reported hours worked. The accountants reported average actual hours of 45.32, 40.13, and 47.28 h per week in the three 6-month survey periods. The instruments were sent at 6-month intervals because it was impractical to conduct and repeat a mail survey within a shorter time interval. The intervals were from December 1 to May 31, June 1 to November 30, and again December 1 to May 31. For this study, December 1–May 31 is the “busy season” sample period. Defining busy season during that time is not unreasonable given audit preparation work and follow-up, but some might define busy season more narrowly to include only January, February, and March (Sweeney & Summers, 2002).7 In order to compare these data with prior research, the average weekly hours over the 6-month period are converted to a 3-month busy season average by assuming that respondents worked 40 h per week outside of the 3 months of the more narrowly defined busy season. This assumption is consistent with the results from the third measurement date (which did not include the busy season period) showing participants worked on average 40 h per week. This conversion results in an estimate of busy season hours worked of approximately 51 h per week in the first busy season and 55 h in the second busy season. The first survey, which was conducted in November, requested opinion data on the number of hours the participants expected to work during busy season and nonbusy season. Table 2, Panel A presents the result that new hires initially expected to work approximately 59 h per week during busy season and 42 h per week outside of busy season.
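The conversion from a 6-month average to a 3-month busy season average can be sketched in Python as follows. The sketch assumes that each month carries equal weight and that the three months outside the narrow busy season average 40 h per week, as stated in the text; the function name `busy_season_weekly_hours` and its parameter names are illustrative only.

```python
def busy_season_weekly_hours(six_month_average: float,
                             nonbusy_weekly_hours: float = 40.0,
                             busy_months: int = 3,
                             period_months: int = 6) -> float:
    """Back out the busy season weekly average from a whole-period weekly average.

    Assumes equal-length months, so the period average is a simple weighted
    mean of the busy and nonbusy monthly averages.
    """
    nonbusy_months = period_months - busy_months
    total = six_month_average * period_months
    return (total - nonbusy_weekly_hours * nonbusy_months) / busy_months


# Reported 6-month averages for the two busy season periods (Table 2, Panel A).
first_busy_season = busy_season_weekly_hours(45.32)
second_busy_season = busy_season_weekly_hours(47.28)
print(f"{first_busy_season:.2f} {second_busy_season:.2f}")
```

With the reported averages of 45.32 and 47.28 h, this yields roughly 51 and 55 h per week, in line with the estimates given above.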
The differences using the 6-month busy season definition are presented, and the initial expectations were statistically significantly higher than the actual hours that the participants reported working. Adjusting for the shorter busy season definition, the expected hours are still greater than those the participants report working, but with a nonsignificant difference. Therefore, for RQ1A, the accountants’ expectations of their work hours were higher than the actual hours worked. Table 2, Panel B presents data on travel. The accountants reported their commute time to clients and the number of clients that required an overnight stay. Longer commutes and more out-of-town clients would be theorized to increase work versus personal life stress. The average
Table 2. Work–Life Balance.

Panel A: Expected Work Hours versus Actual Work Hours (RQ1A)

Time period                         Sample    Reported hours   Estimated 3-month   Expected work    Difference (p-value):   Difference (p-value):
                                    size(a)   per week,        busy season         hours per        expected vs.            expected vs. estimated
                                              mean (SD)        hours per week      week(b),         reported hours          busy season hours
                                                                                   mean (SD)
First 6 months (busy season)        99        45.32 (5.27)     50.64               59.11 (6.07)     13.79 (.0001)           8.47 (.0831)
Second 6 months (nonbusy season)    47        40.13 (4.84)     NA                  42.12 (3.73)     1.99 (.0264)            NA
Third 6 months (busy season)        32        47.28 (3.80)     54.56               59.18 (6.55)     11.90 (.0001)           4.32 (.2571)

Panel B: Client Travel Analysis

Time period         Sample    Commute time,    Clients out of     Hours per client,   Total client hours,   Estimated % of
                    size      mean (SD)        town, mean (SD)    mean (SD)           mean (SD)             time out of town
First 6 months      99(a)     44.02 (20.61)    1 (1.45)           141.15 (143.43)     976.43 (137.13)       14.5%
Second 6 months     47        39.51 (21.45)    1.04 (1.46)        127.29 (88.13)      766.70 (133.48)       17.3%
Third 6 months      32        36.89 (17.23)    1.23 (1.41)        161.24 (78.06)      1055.77 (124.37)      18.9%

Panel C: Change in Opinion on Travel (RQ1B)

Time period         Sample    Estimated % of      Expected time out     Difference   p-value
                    size      time out of town    of town(b),
                                                  mean (SD)
First 6 months      99        14.5%               19.26 (16.12)         4.76         .3842
Second 6 months     47        17.3%               18.77 (17.56)         1.47         .4668
Third 6 months      32        18.9%               19.59 (18.61)         .69          .4853

Panel D: Perception of Flextime (RQ1C)
Subsequent perception: There are adequate opportunities for flextime scheduling (1 = strongly disagree; 7 = strongly agree)
Initial expectation(b): I expect flextime scheduling or customized work arrangements (1 = strongly disagree; 7 = strongly agree)

Time period         Sample    Subsequent perception,   Initial expectation(b),   Difference   p-value
                    size(a)   mean (SD)                mean (SD)
First 6 months      99        3.88 (1.77)              4.58 (1.57)               0.70         .0056
Second 6 months     47        4.18 (1.82)              4.40 (1.51)               0.22         .2252
Third 6 months      32        3.74 (1.20)              4.59 (1.50)               0.85         .0095

Panel E: Perception of Work at Home Opportunities (RQ1D)
Subsequent perception: I am able to work at home when possible (1 = strongly disagree; 7 = strongly agree)
Initial expectation(b): I expect to be able to work at home when possible (1 = strongly disagree; 7 = strongly agree)

Time period         Sample    Subsequent perception,   Initial expectation(b),   Difference   p-value
                    size(a)   mean (SD)                mean (SD)
First 6 months      99        3.22 (1.77)              4.21 (1.75)               0.99         .0001
Second 6 months     47        3.51 (1.70)              4.45 (1.79)               0.94         .0066
Third 6 months      32        3.67 (1.54)              4.50 (1.68)               0.83         .0132

Panel F: Opinion of Work–Life Balance (RQ1E)
Subsequent perception: I am able to balance my work and family life (1 = strongly disagree; 7 = strongly agree)
Initial expectation(b): I expect a reasonable balance of work and family activities (1 = strongly disagree; 7 = strongly agree)

Time period         Sample    Subsequent perception,   Initial expectation(b),   Difference   p-value
                    size(a)   mean (SD)                mean (SD)
First 6 months      99        4.51 (1.45)              5.56 (1.45)               1.05         .0001
Second 6 months     47        4.48 (1.41)              5.74 (1.33)               1.26         .0001
Third 6 months      32        3.95 (1.38)              5.56 (1.37)               1.61         .0001

Panel G: Models of Work–Life Balance Opinion

Time period                         Sample    Overall statistics       t-statistic (p-value)
                                    size(c)   Significance   R2        Total hours     Average commute   Clients out of town
First 6 months (busy season)        81        .0168          .1235     1.98 (.0516)    2.66 (.0096)      1.36 (.1791)
Second 6 months (nonbusy season)    41        .0246          .2214     1.37 (.1779)    2.94 (.0056)      1.72 (.0936)
Third 6 months (busy season)        30        Not significant

(a) For all tables, the sample size is based on the total number of respondents; for most responses the effective sample size is slightly smaller due to some respondents with missing data. In most cases, 1–3 instruments had missing data.
(b) Includes the initial response for only those participants who also completed the later instrument.
(c) The sample size is smaller for the models than for the individual tests as the model requires nonmissing data for all variables.
commute time starts at 44 min and decreases each period to an average of 37 min. The new accountants report on average one out-of-town client each 6-month period. The percentage of time out of town on client work is estimated by multiplying the average number of clients out of town by the average hours per client and then dividing by the total client hours. The estimated percentage of time out of town in each 6-month interval is compared to an initial expectation of travel time. Table 2, Panel C shows a nonsignificant difference in the actual amount of travel compared to the expected amount of travel. Therefore, for RQ1B, the accountants’ expectations of their travel were fairly accurate. The participants responded to two questions on flexible work arrangements. With respect to flextime, the staff accountants’ opinions regarding opportunities for flexible work schedules were lower than their initial expectations in the two 6-month periods that included busy season (see Table 2, Panel D). The participants also responded to an opinion question about the potential to work at home. Table 2, Panel E shows that the participants’ perceptions regarding the potential to work at home were consistently significantly lower than their initial expectations of the ability to work at home. For RQ1C and RQ1D, the accountants’
expectations of flextime and work at home were higher than their opinion after working each period except for flextime during nonbusy season. Table 2, Panel F presents the participants’ opinions of their ability to balance work and family life. For each instrument, the participants reported their current perception of the ability to balance their work and family as lower than their initial expectations (see Table 2, Panel F). The greatest drop from initial expectations was during the fourth survey period when the new accountants reported working on average 47.28 h per week over the 6-month period. Thus, for RQ1E, the accountants’ expectations of their ability to balance work and family life were higher than their opinion after working. Three final analyses on work versus family balance relate the actual work experiences to the changes in opinion on ability to balance work and family life. Regression models were run for each survey period using the current opinion on ability to balance work and family life as the dependent variable. The independent variables were total hours worked in the period, average commute to the client in the period, and the number of clients out of town in the period. The results are shown in Table 2, Panel G. Only the first two models were significant. For these two models, the only significant variable was the commute time; as commute time went up, the ability to balance work and life went down. To summarize, the new accountants’ initial expectations on busy season work hours, nonbusy season work hours, and percent of time out of town appear to be realistic and in fact exceeded the respondents’ actual work experiences. In spite of work hours and travel that were not as strenuous as expected, the accountants’ subsequent perceptions of their ability to balance work versus family life were lower than their initial expectations. The accountants’ perceptions were also lower than their initial expectations on flextime and work at home options. 
One intuitive finding, which resulted from regressing the opinion on ability to balance work and family life against actual experience data, was that when commute times increase, the perception of the ability to balance work and family life decreases.
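The estimate of the percentage of time out of town used for RQ1B (average clients out of town times average hours per client, divided by total client hours) can be sketched in Python as follows. The function name `pct_time_out_of_town` is illustrative, and the inputs are the rounded means from Table 2, Panel B, so results may differ slightly from the published percentages.

```python
def pct_time_out_of_town(clients_out_of_town: float,
                         hours_per_client: float,
                         total_client_hours: float) -> float:
    """Estimated share of client hours spent out of town, as a percentage."""
    return 100.0 * clients_out_of_town * hours_per_client / total_client_hours


# Rounded period means from Table 2, Panel B:
# (period, clients out of town, hours per client, total client hours).
periods = [
    ("First 6 months", 1.00, 141.15, 976.43),
    ("Second 6 months", 1.04, 127.29, 766.70),
    ("Third 6 months", 1.23, 161.24, 1055.77),
]

estimates = {name: pct_time_out_of_town(c, h, t) for name, c, h, t in periods}
for name, pct in estimates.items():
    print(f"{name}: {pct:.1f}%")
```

The first two periods reproduce the reported 14.5% and 17.3%; the third comes out slightly below the reported 18.9%, presumably because the table inputs are rounded.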
Training (RQ2)
The next research question addresses the participants’ experiences and opinions with respect to training. In the first 6 months of employment, the participants received an average of 33.14 h of training. During the next
Table 3. Training.

Panel A: Training Hours

Time period                     Sample size (a)   Mean (standard deviation)
First 6 months                  99                33.14 (25.51)
Second 6 months                 47                43.94 (28.53)
Third 6 months                  32                19.68 (15.06)
Total average training hours                      96.76

Panel B: Perception of Adequacy of Training (RQ2)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: I have received adequate training.
Initial expectation (b): I expect extensive training.

Time period        Sample size   Subsequent perception   Initial expectation (b)   Difference   p-value
First 6 months     99            4.79 (1.34)             5.37 (1.31)               0.58         .0010
Second 6 months    47            4.85 (1.37)             5.36 (1.24)               .53          .0170
Third 6 months     32            4.50 (1.34)             5.25 (1.19)               .75          .0110

(a) For all tables, the sample size is based on the total number of respondents, for most responses the effective sample size is slightly smaller due to some respondents with missing data. In most cases, 1–3 instruments had missing data.
(b) Includes the initial response for only those participants who also completed the later instrument.
6-month period that included nonbusy season, the staff accountants received an average of 43.94 training hours. During the last 6-month period (which included busy season), the average training received was 19.68 h. In summary, staff auditors received on average 96.76 h, or approximately 12 days, of training in the first 18 months of employment. These experiences are summarized in Table 3, Panel A. Table 3, Panel B reports the new staff accountants’ perceptions of training. For each 6-month period, the accountants’ perceptions of the actual training received were significantly lower than their initial expectations on training (see Table 3, Panel B). However, the wording of the questions on training may have influenced these results; the wording on the initial expectation referred to “extensive training,” while the follow-up
question asked about “adequate training.” The wording difference makes a comparison of the before and after opinion problematic; however, in spite of asking for a weaker evaluation (adequate being less than extensive) the opinion on training went down. The final analyses regressed the opinion on adequate training in each 6-month period against the training hours in that period. These models find no significant relationship between the perception of the adequacy of training and the training hours in any of the three 6-month periods. To answer RQ2, the perception of the adequacy of training was lower than the initial expectation of extensive training. Interestingly, the opinion on training was not related to the actual number of training hours, which suggests the new accountants evaluate training based on content or relevance to their work, rather than time spent in classes.

Challenge and Variety (RQ3A and RQ3B)

Auditors can experience variety and challenge in their work through the number and type of clients or the type of work performed. Thus, descriptive data on variety include: average number of clients worked with during a 6-month period, average number of industries represented by the clients, and percentage of work that was audit as compared to other activities such as compilation or review. Table 4, Panel A reports the descriptive results on variety of work. The auditors averaged 8.07 clients during their 6-month period including their first busy season, 7.22 during the nonbusy season, and 7.19 during the last 6-month period (which included their second busy season). After their first 6 months, the auditors averaged 4.39 new clients in the next 6 months and 2.9 new clients during their third 6-month period. Using the first period when all of the clients were new plus the new clients in each period shows that the auditors worked for an average of 15.39 different clients during the first 18 months of employment.
Another characteristic that can contribute to variety is the number of different industries of the auditors’ clients. These auditors worked on clients representing an average of 3.23 industries during their first 6 months, 3.15 industries during the nonbusy season, and 2.71 industries during their third 6 months. Finally, variety can be affected by the work activity itself. The work activity was overwhelmingly audit: audit accounted for 83% of the work during their first 6 months, 70% during their second 6 months, and 80% during their third 6 months.
Table 4. Variety and Challenge.

Panel A: Analysis of Client Engagements

Means (standard deviations).

Time period        Sample size (a)   Number of clients   Number of new clients   Number of industries   Percent of work that was audit
First 6 months     99                8.07 (3.68)         8.07 (3.68)             3.23 (1.55)            .83 (.21)
Second 6 months    47                7.22 (3.42)         4.39 (2.91)             3.15 (1.63)            .70 (.24)
Third 6 months     32                7.19 (2.99)         2.90 (2.90)             2.71 (1.19)            .80 (.25)

Panel B: Perception of Variety of Work (RQ3A)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: I experience variety in my work assignments.
Initial expectation (b): I expect variety in my work assignments.

Time period        Sample size   Subsequent perception   Initial expectation (b)   Difference   p-value
First 6 months     99            5.09 (1.25)             5.88 (1.03)               .78          .0001
Second 6 months    47            4.98 (1.15)             5.64 (1.15)               .66          .0019
Third 6 months     32            5.13 (1.13)             5.78 (1.04)               .33          .0058

Panel C: Perception of Challenge of Work (RQ3B)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: The tasks I perform are challenging.
Initial expectation (b): I expect the work to be challenging.

Time period        Sample size (a)   Subsequent perception   Initial expectation (b)   Difference   p-value
First 6 months     99                4.88 (1.00)             6.09 (.81)                1.21         .0001
Second 6 months    47                5.12 (1.01)             6.11 (.70)                0.99         .0001
Third 6 months     32                5.47 (.84)              6.13 (.71)                0.66         .0008

(a) For all tables, the sample size is based on the total number of respondents, for most responses the effective sample size is slightly smaller due to some respondents with missing data. In most cases, 1–3 instruments had missing data.
(b) Includes the initial response for only those participants who also completed the later instrument.
Table 4, Panels B and C report the auditors’ perceptions of the variety and challenge in their work activities. For both RQ3A and RQ3B, the initial expectations were significantly higher than the perceptions reported after working. One interesting result is that while the opinion on challenge did not meet the initial expectation, it still went up over time. To ensure the results were not an artifact of self-selection bias (i.e., the new accountants who did not feel challenged did not respond to the follow-up instruments), the results on challenge were evaluated for only the 32 participants who completed all four instruments. While the participants still did not report challenge as high as their initial expectation, their opinion on the challenge in their work did increase beyond the 6-month point: initial expectation 6.13, after 6 months 5.06, after 12 months 5.30, and after 18 months 5.47. Models that regressed the participants’ opinions on variety and challenge on number of clients, number of industries, and percentage of work that was audit were not significant.

To summarize, the participants reported working with an average of 7–8 clients and working in an average of three different industries during each 6-month interval. These client experiences would seem to provide quite a bit of variety and challenge for a new staff accountant. Yet, the perceptions of variety and challenge of their work activities all dropped significantly from the initial expectations. The downward changes in opinions cannot be related to the number of clients, the number of industries, or the percent of work that was audit. These results raise the question of how the participants evaluate variety or challenge if not based on clients, industries, or work activity. It could be that these Generation Y employees will always be disappointed with their job at first, because being new at a job means starting with simpler tasks, less control, and less responsibility.
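The before-and-after comparisons in Table 4 pair each participant's initial expectation with the same participant's later perception. A minimal sketch of such a paired comparison follows, assuming a paired t-test; this section of the chapter reports only differences and p-values and does not name the test. The function name `paired_t` and the eight response pairs are hypothetical and are not the study's data.

```python
import math
import statistics


def paired_t(initial, later):
    """Return (mean difference, t statistic, degrees of freedom) for paired scores.

    The difference is initial minus later, so a positive value means the
    later perception fell short of the initial expectation.
    """
    diffs = [a - b for a, b in zip(initial, later)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t_stat, n - 1


# Hypothetical 7-point Likert responses from the same eight participants.
initial_expectation = [7, 6, 6, 7, 5, 6, 7, 6]
later_perception = [5, 5, 6, 5, 4, 5, 6, 5]

mean_diff, t_stat, df = paired_t(initial_expectation, later_perception)
print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.2f}, df = {df}")
```

Converting the t statistic to a p-value requires the t distribution with `df` degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_rel`) would do this in one call. Pairing matters here because only participants who completed the later instrument contribute to each comparison, as table note b indicates.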
Worker Relationships (RQ4A and RQ4B)

The next analyses pertain to worker relationships. The participants reported on the initial survey instrument that they knew on average 7.5 people out of a starting “class” that averaged 34.69 new staff accountants, or on average 27% of their peers when they started work. The next question addresses the participants’ before and after perceptions of the friendliness of coworkers. Table 5, Panel A presents results on coworker relationships. Overall, the new staff auditors rate the friendliness of their coworkers quite high, close to 6 on a 7-point scale. There is a small but significant decrease from the auditors’ initial expectation of coworker friendliness to their perception after working for
Table 5. Coworkers and Supervisors.

Panel A: Perception of Friendly Coworkers (RQ4A)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: My coworkers are friendly.
Initial expectation (b): I expect friendly coworkers.

Time period        Sample size (a)   Subsequent perception   Initial expectation (b)   Difference   p-value
First 6 months     99                5.97 (1.08)             6.21 (.80)                .23          .0322
Second 6 months    47                5.98 (.99)              6.19 (.77)                .21          .1340
Third 6 months     32                5.94 (1.01)             6.28 (.68)                .34          .0509

Panel B: Experiences with Supervision and Mentoring

Means (standard deviations). Supervisor ratings use a 3-point scale (1 = below expectations; 3 = exceeds expectations).

Time period        Sample size   Number of supervisors   Average rating of recent supervisors   Percent that had a mentor   Number of meetings with a mentor   Percent that chose their mentor
First 6 months     99            7.24 (3.68)             2.3 (.51)                              90% (.30)                   2.98 (3.58)                        21% (.41)
Second 6 months    47            5.75 (3.30)             2.38 (.53)                             89% (.31)                   2.28 (2.76)                        26% (.45)
Third 6 months     32            5.9 (2.8)               2.45 (.51)                             88% (.34)                   3.24 (3.57)                        36% (.49)

Panel C: Perception of Supervisory Mentoring (RQ4B)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: There is adequate mentoring from superiors.
Initial expectation (b): I expect mentoring from superiors.

Time period        Sample size   Subsequent perception   Initial expectation (b)   Difference   p-value
First 6 months     99            4.74 (1.38)             5.79 (1.11)               1.05         .0001
Second 6 months    47            4.64 (1.37)             5.64 (1.07)               1.00         .0001
Third 6 months     32            4.75 (1.22)             5.66 (.97)                0.91         .0004

Panel D: Models of Opinion on Supervisory Mentoring

Overall statistics

Time period                         Sample size (c)   Model significance   R2
First 6 months (busy season)        74                .0128                .1660
Second 6 months (nonbusy season)    33                .0012                .4650
Third 6 months (busy season)        26                Not significant

Variable t (p-value)

Time period                         Number of meetings with mentor   Mentor chosen    Number of supervisors   Rating of supervisors
First 6 months (busy season)        .1184 (.0056)                    .5676 (.1243)    .0123 (.7626)           .4479 (.1253)
Second 6 months (nonbusy season)    .1980 (.0064)                    .0319 (.9425)    .0653 (.2910)           .8760 (.0247)
Third 6 months (busy season)        Not significant

(a) For all tables, the sample size is based on the total number of respondents, for most responses the effective sample size is slightly smaller due to some respondents with missing data. In most cases, 1–3 instruments had missing data.
(b) Includes the initial response for only those participants who also completed the later instrument.
(c) The sample size is smaller for the models than for the individual tests as the model requires nonmissing data for all variables.
6 months. The answer to RQ4A is that friendliness of coworkers shows the closest correspondence between initial expectations and subsequent perceptions of any question on the instrument. Regression models with the dependent variable of perception of friendliness of coworkers and the independent variables of number of people the accountant knew or percent of the “class” that the accountant knew when starting work are not significant. The participants also supplied descriptive data on supervision and mentoring along with opinion data on supervision and mentoring. Table 5, Panel B presents descriptive data on supervisors and mentors. The new accountants worked for on average 7.24 supervisors during the first 6 months and then 5.27 and 5.9 supervisors in the following two 6-month periods, respectively. Overall, the new staff accountants’ evaluations of the quality of their supervision were high; only 2% of supervisors did not meet or exceed expectations in the first two reporting periods and 100% of supervisors met or exceeded expectations in the last reporting period. The firm had a formal mentoring program. Over the 18-month period, 89% of the participants reported having a mentor. The experiences with mentors are provided in Table 5, Panel B.
The participants met with their mentors on average 2.98 times in the first 6 months, 2.28 times in the next 6 months, and 3.24 times in the last 6 months. The table also reports the percent of the participants who chose their mentor versus those who were assigned a mentor. Panel C of Table 5 reports the participants’ perceptions of supervisory mentoring. The auditors’ initial expectations regarding adequate mentoring were significantly higher than their subsequent perceptions based on their experiences, thus answering RQ4B. To relate actual experiences to the new staff accountants’ perceptions, models for each period were run that regress the opinion on mentoring against the number of meetings with the mentor, whether the mentor was assigned or chosen, the number of supervisors, and how the participant rated his/her supervisors. The results of the regressions are shown in Table 5, Panel D. The models are significant for the first two periods. The opinion on mentoring is significantly positively related to the number of meetings with a mentor and, in the second period, to the ratings of supervisors.

To summarize, the new staff accountants report opinions that are generally positive about their coworkers and supervisors. Their opinion of the friendliness of their coworkers shows the closest correspondence to initial expectations of any of the factors measured. Supervisors generally met or exceeded expectations and 89% of the participants reported having a mentor assigned to them through a formal firmwide mentoring program. These findings may be particularly important to Generation Y workers who value friendship and mentoring and may give accounting firms an edge in the job market for hiring Generation Y workers (Gerdes & Broden, 2007).
Raises and Promotions (RQ5A and RQ5B)

All of the instruments asked the participants to report their current salary. The accountants received their raises in the period that corresponded with the third survey, or after they had been with the firm for a year or more. Table 6, Panel A reports the average salary data for the 47 participants who were still in the study at the time of the third instrument. The average starting salary was $41,798 (see Note 8) and the average raise was almost 8%. These participants also reported their opinion of their opportunity for alternative employment on a scale from no opportunity to almost unbounded opportunity in the same period. The employees’ mean response fell between a “reasonable opportunity” and a “good deal of opportunity.”
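As a quick check on the "almost 8%" figure, the percent increase can be recomputed from the two salary means reported in Table 6, Panel A. The sketch below is illustrative only: the function name `percent_increase` is hypothetical, and it computes the increase in the mean salary, which need not equal the mean of the individual participants' percent raises.

```python
def percent_increase(before, after):
    """Percent change from `before` to `after`."""
    return (after - before) / before * 100.0


# Mean salaries for the 47 participants, as reported in Table 6, Panel A.
starting_salary = 41_797.87
salary_after_raise = 45_134.47

raise_pct = percent_increase(starting_salary, salary_after_raise)
print(f"{raise_pct:.2f}%")  # agrees with the 7.98% shown in the table
```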
Table 6. Raises and Promotion.

Panel A: Salary Data

Means (standard deviations).

Sample             Starting salary          Salary after raise       Percent increase
Second 6 months    $41,797.87 (4,191.94)    $45,134.47 (4,736.40)    7.98%

Panel B: Perception of Raises (RQ5A)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: I have received raises commensurate with my performance.
Initial expectation (b): I expect raises commensurate with my performance.

Time period        Sample size (a)   Subsequent perception   Initial expectation (b)   Difference   p-value
Second 6 months    47                3.24 (1.67)             6.34 (.87)                3.10         .0001
Third 6 months     32                2.90 (1.64)             6.25 (.95)                3.35         .0001

Panel C: Perception of Promotions (RQ5B)

Means (standard deviations) on a 7-point scale (1 = strongly disagree; 7 = strongly agree).
Subsequent perception: I have received a promotion commensurate with my performance.
Initial expectation (b): I expect promotions commensurate with my performance.

Time period        Sample size (a)   Subsequent perception   Initial expectation (b)   Difference   p-value
Second 6 months    47                4.36 (1.69)             6.30 (.93)                1.94         .0001
Third 6 months     32                4.28 (1.86)             6.25 (.95)                1.97         .0001

(a) For all tables, the sample size is based on the total number of respondents, for most responses the effective sample size is slightly smaller due to some respondents with missing data. In most cases, 1–3 instruments had missing data.
(b) Includes the initial response for only those participants who also completed the later instrument.
Table 6, Panel B reports the accountants’ perceptions of their raises and promotions. The accountants’ perceptions of their raise experience show the largest difference from their initial expectations. To answer RQ5A, the opinion after working declines significantly from the initial expectation. The next greatest decline is in their perception of the promotion process (Table 6, Panel C). The result on the opinion of the promotion occurs even though the data are gathered at 12 and 18 months. This result particularly indicates that the participants may not have a realistic opinion of the workplace as 12–18 months would be quite soon for a promotion. Thus, for RQ5B there is little correspondence between initial expectation and opinion after working. Several regression models were run to determine why the new accountants’ opinions of their raises dropped so significantly. Independent variables included the amount of the raise, the number of total hours worked, the number of client hours worked, and the opinion on opportunity for alternative employment. None of the models had explanatory power. Two possible explanations exist for the strong negative perceptions regarding raises. The new hires may have simply had unrealistic expectations. One characteristic of Generation Y employees is that they have been recognized and rewarded extensively throughout their lives (Sridharan, 2007), which may lead them to expect frequent or large raises. An alternative explanation is that at the time of the study, starting salaries at accounting firms were consistently rising (Schroeder & Reichardt, 2003). Thus, the employees could have faced some salary compression, where the next year’s ‘‘class’’ were making the same or more than the participants. Prior research has found this type of salary comparison to have a negative effect on job satisfaction (Sweeney & McFarlin, 2005).
LIMITATIONS

While this research makes an important contribution to the literature, there are some limitations that should be noted. First, the participants were limited to one class of starting auditors who were based in the United States and employed by one international accounting firm; therefore, the results might not be generalizable to other new staff accountants. On the other hand, the study was limited to this group to control for interfirm variance on personnel policies, and prior studies that used samples from multiple firms have generally not found significant between-firm differences.
Second, the study used a mail survey approach. Mail surveys are subject to nonresponse bias and self-selection bias. That is, the respondents might not be representative of the entire population (Alreck & Settle, 1985). Further, mail surveys are limited with respect to the clarity of the questions and the interpretations of the questions by the respondents. This mail survey is also limited due to some issues in consistency of wording between the initial instrument and the follow-up instruments. Third, all of the data are self-reported. Self-report biases include trying to give the responses the participants believe the administrator desires and/or providing responses that the respondents feel will reflect positively on them. Fourth, this is a longitudinal study and as such is subject to the biases inherent in longitudinal research, such as anchoring, where participants try to be consistent with their previous answers; priming, where the participants are sensitized to the future questions from the prior instruments; history effects, where there are uncontrollable changes in the external environment that affect participant answers; and withdrawal effects, which result from participants dropping out in a systematic fashion. Given the length and time spacing of the instruments and the significant drops in most of the perception data, anchoring does not appear to be occurring. Priming could be occurring as the participants might be trying to guess “expected” answers from the prior questions. The study is subject to a history effect in that the Enron scandal happened during the 18-month period under consideration. However, the effect of this scandal on the participants is unclear. The participants’ perceptions on the workplace dropped after the first 6 months, but the Enron scandal did not commence until the second 6-month period.
Finally, while the study lost participants over the 18 months, there was no discernable demographic bias between the participants who withdrew and the participants who continued to provide information.
CONCLUSIONS

The purpose of this study was to examine within-subject change over time using the participants’ initial expectations about the public accounting workplace, their actual experiences in that workplace, and their subsequent
Table 7. Summary of Results (Based on 7-Point Likert Scale).

Each cell shows the difference from the initial expectation / the mean response.

Factor                                   First 6 months   Second 6 months   Third 6 months
Flextime scheduling                      .70 / 3.88       NS / 4.18         .85 / 3.74
Work at home opportunities               .99 / 3.22       .94 / 3.51        .83 / 3.67
Balance of work and family activities    1.05 / 4.51      1.26 / 4.48       1.61 / 3.95
Training                                 .58 / 4.79       .53 / 4.85        .75 / 4.50
Variety of work                          .78 / 5.09       .66 / 4.98        .33 / 5.13
Challenge of work                        1.21 / 4.88      .99 / 5.12        .66 / 5.47
Friendly coworkers                       .23 / 5.97       NS / 5.98         NS / 5.94
Mentoring                                1.05 / 4.74      1.00 / 4.64       .91 / 4.75
Raises                                   NA               3.10 / 3.24       3.35 / 2.90
Promotion                                NA               1.94 / 4.36       1.97 / 4.28

NS, not significant; NA, not applicable.
job perceptions about that workplace. The results show, consistent with prior research, that initial expectations were high and that subsequent to work experiences, the accountants’ opinions dropped significantly on a variety of factors. Table 7 summarizes the results of the study, presenting the decline from the initial expectation and the opinion at each time period using a 7-point Likert scale. For 25 out of 27 measurement points, the accountants’ opinions declined from their initial expectations. The results are remarkably similar to other studies on new staff accountants’ expectations and subsequent opinions (Dean et al., 1988; Carcello et al., 1991; Hiltebeitel et al., 2000). The similarity in results indicates that new staff accountants from Generation Y are not different from previous new staff accountants with respect to occupational reality shock when starting a new job. Employees with unmet expectations have been found to have more negative job attitudes and behaviors, which have, in turn, been linked to turnover (Padgett et al., 2005;
Dean et al., 1988). The participants in this study did experience high turnover; of the 185 participants who responded to the initial survey, 89 (48.11%) left the firm within 3 years. The actual experiences reported by the participants appear to be realistic and reflect the types of experience and training for which many enter large public accounting firms (Marxen, 1996). Interestingly, the actual experiences do not appear to be driving the opinions of the accountants. Only two significant relationships were found between the work activities and opinions: the length of their commutes was negatively related to the opinion on ability to balance work and family life, and the number of meetings with a mentor was positively related to the opinion on mentoring. For the other opinions, the work factors that would seem to drive the opinion were not significantly related. For example, the amount of the raise was not related to the opinion about raises, number of hours worked was not related to ability to balance work–family life, training hours were not related to the opinion on training, number of clients was not related to the opinion about variety in the work, and so forth. These results raise the question of what is driving the declines in new staff accountants’ opinions. There appear to be two alternative explanations. The first is that the accounting firm or others may be “overselling” the positives of the public accounting environment. Dean et al. (1988) suggest that initial expectations of the workplace are developed over time from societal stereotyping, professional education from academic institutions, and organizational entry via the employee recruitment and orientation process. During these development phases, the accountants may be obtaining too positive a view of the public accounting workplace. Alternatively, the new accountants may be overly optimistic or misinterpreting the messages from the firm or faculty.
Perhaps high-achieving Generation Y students who land jobs with the largest CPA firms are so excited that they are bound to be disappointed (Poznanski & Bline, 1997). The results may be particularly strong for this group of participants because they had little prior work experience; occupational reality shock is more prevalent for inexperienced workers (Padgett et al., 2005). Accounting firms can use these results to modify their relations with new staff accountants. The new staff accountants’ opinions after working were significantly lower than their initial expectations. To alleviate these declines, the firms might want to improve new staff communications. Some firms are already taking steps to improve communications with students and new hires by using web sites with video interviews and rap songs, as well as self-scheduled performance reviews (Gerdes & Broden, 2007). Another possible
method of combating occupational reality shock is for firms to schedule more morale-building events during the first 6 months of new accountant employment. Finally, firms might want to develop retention policies aimed at new staff accountants, such as offering a “staying bonus” at the end of 3 years, offering earlier vesting in retirement plans, or offering seniority-based privileges such as 4 weeks of vacation at an earlier tenure (e.g., after 2 years). Changing actual work experiences might also be effective, at least with respect to reducing commuting times and ensuring all employees have and meet with mentors several times per year. One bright spot for the accounting firms is that, in spite of the declining opinions and high turnover, the accountants who continued in this study were still relatively positive (above the midpoint of 4 on the 7-point scale) about several of the work factors. These include the ability to balance work and family life, their training, the variety and challenge in their work, and their relationships with coworkers and supervisors (see Table 7 and Note 9). This result is also similar to previous studies (Carcello et al., 1991; Hiltebeitel et al., 2000). Faculty must also be involved in helping new accountants to develop realistic expectations of the workplace. Prior research has found that professors do not communicate enough information about the professional environment to students (DeZoort et al., 1997). Faculty should share relevant work environment data, such as the data presented in this paper, with students prior to their acceptance of jobs with CPA firms. Faculty members and recruiters particularly need to communicate to students realistic expectations about raises and promotions.
Interestingly, while some may argue that internships may provide job candidates with more realistic expectations about their future work, prior research found that interns and non-interns experienced similar declines in job perceptions once in permanent staff accounting positions (Hermanson, Hill, & Ivancevich, 2005). Thus, internships do not appear to provide a more realistic view of the firm. Instead, firms appear to be using internships more as a recruiting tool rather than a socialization tool (Hermanson et al., 2005). Accountant satisfaction and turnover have been the subjects of extensive previous research. This study approached the topic using a within-subjects design focusing on changes in individual perceptions over time. Given the continuing focus on the staffing of the accounting profession, additional research on accounting firm turnover in the post-SOX environment should be conducted. Ultimately, the accounting profession needs to develop ways to attract and retain the best and brightest young business students. By so doing, firms and the profession will strengthen the United States’ capital markets and promote reliable financial reporting.
NOTES

1. There exists some ambiguity with respect to the start year of the various generations. The start year varies from 1 to 4 years from study to study. Further, some studies include “generations” between the three major categories listed in this paper.

2. Job satisfaction and turnover have been linked to a long list of antecedent and intervening variables such as professional commitment, organizational commitment, organizational trust, job stress, job insecurity, role conflict, role ambiguity, burnout, and intention to turnover. For simplicity and brevity, where a previous study found, for example, that a factor is related to organizational commitment, which is in turn related to job satisfaction, this review infers the more direct relationship.

3. A sample was used rather than surveying 100% of the new hires to help protect the respondents’ anonymity.

4. The number of participants that began work in each month was 13 in July, 14 in August, 3 in September, and 95 in October. We examine whether the difference in start month affected the new staff accountants’ initial expectations by running ANOVA models with the dependent variable as each initial expectation and the expected work hours. For the 13 models run, there were 2 significant differences; the 85 accountants that started in October expected to work longer hours than the others and the 14 accountants that started in August expected friendlier coworkers. We do not believe that these differences affect our results.

5. Because the responses were obtained via a mail survey, a test was made to determine if there were differences between early and late responders. This test generally proxies for nonresponse bias. t-tests were computed for the numeric demographic variables and no significant differences were found between early and late responders. Significance is held at 0.05 or less for all tests throughout this research.

6. All of the survey questions were worded as work versus family life. However, the participants in the study were overwhelmingly young, single, with no children, and no homeowner responsibilities; thus, family life would proxy for leisure time (see Table 1, Panel B).

7. The busy season definition could also be affected by industry, because some industries have alternative year-ends that would cause their busy time to fall outside of January to March.

8. For the entire 185 participants who responded to the first instrument, the average starting salary was $42,823.

9. The results are for those who remained with the firm and responded to the survey instruments, and thus may be positively biased.
ACKNOWLEDGMENTS
We would like to thank the AICPA Women and Family Issues Executive Committee, Dixon Hughes LLP, Kennesaw State University, and The
University of North Carolina Wilmington for sponsoring this research. We are also greatly indebted to the accounting firm that provided the participants for this research, and to the paper’s reviewers, whose many helpful comments significantly improved this work.
REFERENCES
Almer, E., & Kaplan, S. (2002). The effects of flexible work arrangements on stressors, burnout, and behavioral job outcomes in public accounting. Behavioral Research in Accounting, 14, 1–34.
Alreck, P., & Settle, R. (1985). The survey research handbook. Homewood, IL: Irwin.
American Accounting Association. (1993). Committee on the future structure, content, and scope of accounting education: Improving the early employment experience of accountants: Issues statement no. 4. Issues in Accounting Education, 8, 431–435.
Bamber, E., Snowball, D., & Tubbs, R. (1989). Audit structure and its relation to role conflict and role ambiguity: An empirical investigation. The Accounting Review, 64(2), 285–299.
Bell, N., & Narz, M. (2007). Meeting the challenges of age diversity in the workplace. The CPA Journal, 77(2), 56–59.
Bernardi, R., & Hooks, K. (2001). The relationships among lifestyle preference, attrition, and career orientation: A three-year longitudinal study. Advances in Accounting Behavioral Research, 4, 207–232.
Carcello, J., Copeland, J., Jr., Hermanson, R., & Turner, D. (1991). A public accounting career: The gap between student expectations and accounting staff experiences. Accounting Horizons, 5(3), 1–11.
Dalton, D., Hill, J., & Ramsay, R. (1997). Women as managers and partners: Context specific predictors of turnovers in international public accounting firms. Auditing: A Journal of Practice and Theory, 16, 29–50.
Dean, R., Ferris, K., & Konstans, C. (1988). Occupational reality shock and organizational commitment: Evidence from the accounting profession. Accounting, Organizations and Society, 13, 235–250.
DeZoort, F., Lord, A., & Cargile, B. (1997). A comparison of accounting professors’ and students’ perceptions of the public accounting work environment. Issues in Accounting Education, 12(2), 281–298.
Fisher, R. (2001). Role stress, the type A behavior pattern, and external auditor job satisfaction and performance. Behavioral Research in Accounting, 13, 144–170.
Gaertner, J., & Ruhe, J. (1981). Job-related stress in public accounting. Journal of Accountancy, 151(6), 68–74.
Gerdes, L., & Broden, F. (2007). The best places to launch a career. BusinessWeek (September 24), 49–60.
Harrell, A. (1990). Longitudinal examination of large CPA firm auditors’ personnel turnover. Advances in Accounting, 8, 233–246.
Hazard, C. (2007). 2 generations, 1 perfect match: Young millennials find willing mentors in baby boomers. Knight Ridder Tribune Business News, September 8.
Hermanson, H., Hill, M., & Ivancevich, S. (2002). Who are we hiring? Characteristics of entrants to the profession. The CPA Journal, 72(8), 67–69.
Hermanson, H., Hill, M., & Ivancevich, S. (2005). The effect of internships on new accountants. Working Paper. Kennesaw State University, Kennesaw, GA.
Hiltebeitel, K., Leauby, B., & Larkin, J. (2000). Job satisfaction among entry-level accountants. The CPA Journal, 70, 76–78.
Kaplan, S., Keinath, A., & Walo, J. (2001). An examination of perceived barriers to mentoring in public accounting. Behavioral Research in Accounting, 13, 195–220.
Marxen, D. (1996). The big 6 experience: A retrospective account by alumni. Accounting Horizons, 10, 73–87.
Moyes, G., Williams, P., & Koch, B. (2006). The effects of age and gender upon the perceptions of accounting professionals concerning their job satisfaction and work-related attributes. Managerial Auditing Journal, 21(5), 536–561.
Padgett, M., Gjerde, K., Hughes, S., & Born, C. (2005). The relationship between preemployment expectations, experiences, and length of stay in public accounting. Journal of Leadership and Organizational Studies, 12(1), 82–102.
Pasewark, W., & Viator, R. (2006). Sources of work–family conflict in the accounting profession. Behavioral Research in Accounting, 18, 147–165.
Patten, D. (1995). Supervisory actions and job satisfaction: An analysis of differences between large and small public accounting firms. Accounting Horizons, 9, 17–28.
Peterson, R., & Devlin, J. (1998). Attitudes of graduating accounting seniors on entry-level positions: An international comparison. Journal of Education for Business, 71(1), 54–57.
Poznanski, P., & Bline, D. (1997). Using structural equation modeling to investigate the causal ordering of job satisfaction and organizational commitment among staff accountants. Behavioral Research in Accounting, 9, 154–171.
Rasch, R., & Harrell, A. (1990). The impact of personal characteristics on the turnover behavior of accounting professionals. Auditing: A Journal of Practice and Theory, 9, 90–102.
Reed, S., & Kratchman, S. (1989). A longitudinal and cross-sectional study of students’ perceptions of the importance of job attributes. Journal of Accounting Education, 7(2), 171–193.
Robert Half International. (2007). The red carpet treatment. Journal of Accountancy, 204(2), 36–39.
Scandura, T., & Viator, R. (1994). Mentoring in public accounting firms: An analysis of mentor-protégé relationships, mentorship functions, and protégé turnover intentions. Accounting, Organizations and Society, 19(8), 717–734.
Schroeder, D., & Reichardt, V. (2003). Members’ salaries are still going up. Strategic Finance, 84(12), 27–40.
Sridharan, V. (2007). Generation gaps can lead to conflicts: But corporate cultures can encompass all ages. Knight Ridder Tribune Business News, September 12.
Stallworth, H. (2003). Mentoring, organizational commitment and intentions to leave public accounting. Managerial Auditing Journal, 18(8), 405–418.
Sweeney, J., & Summers, S. (2002). The effect of the busy season workload on public accountants’ job burnout. Behavioral Research in Accounting, 14, 223–245.
Sweeney, P., & McFarlin, D. (2005). Wage comparisons with similar and dissimilar others. Journal of Occupational and Organizational Psychology, 78(1), 113–131.
Trump, G., & Hendrickson, H. (1971). Staff retention in public accounting firms. Journal of Accountancy, 131(1), 87–90.
Trunk, P. (2007). What Gen Y really wants. TIME, 170(2), 46.
Viator, R. (1999). An analysis of formal mentoring programs and perceived barriers to obtaining a mentor at large public accounting firms. Accounting Horizons, 13, 37–53.
Walmsley, P. (2007). Playing the workforce generation game. Strategic HR Review, 6, 32–36.
APPENDIX A. WORK/LIFE BALANCE RESEARCH PROJECT: PERCEPTIONS AT INITIAL EMPLOYMENT
Last 5 digits of your SSN ___________________
Please fill in the blank, circle the number, or check the box or line as appropriate. If information is not available or not applicable, please indicate using N/A. Please complete all pages of this questionnaire. Thank you for your participation in this research project.

Demographic Information
1. Date employment with firm began (MM/DD/YY) _____
2. Employed in: Audit _____ Tax _____ Consulting _____ Other _____
3. Office address: Street _____ City _____ State _____
4. In size, your office would be considered: Small _____ Medium _____ Large _____
5. Your office is considered: a Main location _____ a Satellite location _____
6. Initial annual salary _____
7. Date of birth (MM/DD/YY) _____
8. Gender: Male _____ Female _____
9. Race: White Black Asian Hispanic Native American Multiracial (Circle one)
10. Marital status: Single _____ Married _____ Divorced _____ Widowed _____
11. Number of children _____, Dates of birth _____ (MM/DD/YY for each child)
12. Current living quarters: Apt. _____ Parent’s home _____ Own home _____ Other _____
Educational Background
1. Undergraduate degree _____ Subject _____
2. What was your overall undergraduate GPA? _____
3. What was your total SAT or ACT score? _____
4. Did you participate in an internship program? Yes _____ No _____
5. Do you have other previous full-time work experience? Yes _____ No _____
6. Approximately how many people at your level started during this hiring period _____
7. Approximately how many people at your level did you know before beginning work ________

Initial Job Expectations
1. How many hours per week do you expect to work on average during: Busy Season? _____ Nonbusy Season? _____
2. What percentage of the time do you expect to travel out of town on job assignments? _____%

What other job expectations do you have based on your hiring process (scale of 1 = strongly disagree to 7 = strongly agree)?
Please circle the number corresponding to your expectation.

(1 = Strongly Disagree, 7 = Strongly Agree)
I expect the work to be challenging                               1 2 3 4 5 6 7
I expect variety in my work assignments                           1 2 3 4 5 6 7
I expect to enjoy my work                                         1 2 3 4 5 6 7
I expect friendly coworkers                                       1 2 3 4 5 6 7
I expect extensive training                                       1 2 3 4 5 6 7
I expect mentoring from superiors                                 1 2 3 4 5 6 7
I expect flextime scheduling                                      1 2 3 4 5 6 7
I expect to be able to work at home when possible                 1 2 3 4 5 6 7
I expect a reasonable balance of work and family activities       1 2 3 4 5 6 7
I expect raises commensurate with my performance                  1 2 3 4 5 6 7
I expect promotion commensurate with my performance               1 2 3 4 5 6 7
APPENDIX B. WORK/LIFE BALANCE RESEARCH PROJECT: BI-ANNUAL PERCEPTIONS AND EXPERIENCES
Last 5 digits of your SSN ___________________
Thank you for your participation in this study. Please fill in the blank, circle the number, or check the box or line as appropriate. If information is not available or not applicable, please indicate using N/A. Please complete all pages of this questionnaire.

Demographic Information
Some of these questions are similar to ones asked in the last questionnaire. We are interested in whether your circumstances have changed in the last 6 months.
1. Current Salary _____
2. Change in marital status over the last 6 months: None _____ Married _____ Divorced _____ Widowed _____
3. Number of New Children _____, Dates of Birth (MM/DD/YY) _____
4. Change in current living quarters: None _____ Moved to: Apt. _____ Parent’s home _____ Own home _____ Other _____
Summary of Professional Experiences
Using your time report software as needed, please address the following questions for the period December 1, 2000 through May 31, 2001.
1. How many hours did you charge to client engagements during the period? _____ Hours
2. How many hours were charged as nonengagement hours during the period? _____ Hours
3. How many hours were charged to training during the period? _____ Hours
4. How many vacation hours did you take? _____ Hours
5. How many sick hours did you take? _____ Hours
6. How many total hours did you work? _____ Hours
Detail of Professional Experiences (December 1, 2000 through May 31, 2001)
Using your time report software as needed, please provide the following information for the clients to which you were assigned during the period December 1, 2000 through May 31, 2001.
1. Client Name
2. Ownership Code: (1) Private (2) Public
3. Industry Number: 1. Automotive 2. Energy 3. Financial Services 4. Health and Life Sciences 5. Govt. 6. Manufacturing 7. Real Estate 8. Retail & Consumer Products 9. Tech, Comms & Entertain
4. Client’s Total Revenue
5. Is this a new (N) or continuing (C) client for you?
6. Approximate number of weeks you worked on the client in this time period
7. Number of Hours charged to client
8. Approximate commute time in minutes from your HOME to client OR did job qualify for an overnight stay (OS)
9. What type of work? (1) Audit (2) Pension Plan (3) Compilation (4) Review (5) Other attest (6) Other
Example entry: Example Co.; 2; 6; $150 M; C; 3 weeks; 130; 25 mins; 1
Job Perceptions
Please circle the number that indicates your level of agreement with the following statements.
N/A = Not applicable, 1 = Strongly disagree, 7 = Strongly agree
1. The tasks I perform are challenging.                                   N/A 1 2 3 4 5 6 7
2. The tasks I perform are interesting.                                   N/A 1 2 3 4 5 6 7
3. I experience variety in my work assignments.                           N/A 1 2 3 4 5 6 7
4. I enjoy my work activities.                                            N/A 1 2 3 4 5 6 7
5. My coworkers are friendly.                                             N/A 1 2 3 4 5 6 7
6. I have received adequate training.                                     N/A 1 2 3 4 5 6 7
7. There is adequate mentoring from superiors.                            N/A 1 2 3 4 5 6 7
8. Mentoring has been valuable to my career.                              N/A 1 2 3 4 5 6 7
9. Mentoring has been valuable to my personal growth.                     N/A 1 2 3 4 5 6 7
10. I am able to balance my work and my family life.                      N/A 1 2 3 4 5 6 7
11. There are adequate opportunities for flextime scheduling or customized work arrangements.   N/A 1 2 3 4 5 6 7
12. I am able to work at home when possible.                              N/A 1 2 3 4 5 6 7
13. I have received raises commensurate with my performance.              N/A 1 2 3 4 5 6 7
14. I have received a promotion commensurate with my performance.         N/A 1 2 3 4 5 6 7
Other Perceptions
How would you describe your opportunity for alternative employment?
_____ No opportunity.
_____ Little opportunity.
_____ Some opportunity.
_____ Reasonable opportunity.
_____ A good deal of opportunity.
_____ A great deal of opportunity.
_____ Almost unbounded opportunity.
In your office, how fairly are promotions handled?
_____ Never fairly.
_____ Seldom fairly.
_____ Occasionally fairly.
_____ About half the time fairly.
_____ A good deal of the time fairly.
_____ Most of the time fairly.
_____ All of the time fairly.
_____ Don’t know yet.

Mentoring and Supervision
Please list the number of supervisors that you have worked for during the period December 1, 2000 through May 31, 2001. _____
In general, please rate the quality of your supervisors:
_____ Above expectations.
_____ At expectations.
_____ Below expectations.
I have a mentor: Yes _____ No _____
I met with my mentor _____ times during this 6-month period (number of meetings)

THANK YOU AGAIN FOR YOUR PARTICIPATION IN THIS STUDY
AN EXAMINATION OF BUSINESS STUDENT PERCEPTIONS: THE EFFECT OF MATH AND COMMUNICATION SKILL APPREHENSION ON CHOICE OF MAJOR
Wilda F. Meixner, Dennis Bline, Dana R. Lowe and Hossein Nouri
Advances in Accounting Behavioral Research, Volume 12, 185–200. Copyright © 2009 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISSN: 1475-1488/doi:10.1108/S1475-1488(2009)0000012010

ABSTRACT
Communication researchers have observed that students will avoid majors that require skills toward which they exhibit a high level of apprehension. Historically, accounting has been perceived as requiring more math skills and fewer communication skills than other business majors, so accounting has typically attracted students with low math apprehension and high communication (written and oral) apprehension. The current study investigates whether business students’ perceptions across business majors regarding the level of mathematics, writing, and oral communication skills required for accounting reflect the recent changes in pedagogy and curriculum content for the accounting major.
The results indicate that students in other business majors perceive the skills required to be an accounting major (more math and less communication) differently from accounting majors. On the other hand, accounting majors’ perceptions of the skills needed in an alternative business major are generally similar to those of students in the respective major. These observations suggest that accounting majors have gotten the word that professional expectations of accountants involve substantial communication skill, while that message has apparently not reached students who elect to major in other business fields.
INTRODUCTION
Much has been written about attracting more and better students to accounting as an academic major. Some studies have examined the decision criteria used by business students when choosing accounting as their academic major (e.g., Gul, Andrews, Leong, & Ismail, 1989; Cohen & Hanno, 1993). Cohen and Hanno (1993) identified factors such as the effect of career counseling, math skills, perceived quantitative orientation, and workload in accounting courses. Gul et al. (1989) found that students were concerned about potential for high levels of earnings, satisfaction in the chosen career, and availability of employment.
One aspect of the decision process to major in accounting not considered thus far is the student’s perception of the skills necessary to succeed in each business discipline as compared to other business disciplines and the individual’s feelings about the skills required. While communication researchers have observed that students will generally avoid majors that require the use of skills where the individual has a high level of apprehension (Daly & McCroskey, 1975; Daly & Shapiro, 1978; Bennett & Rhodes, 1988), they have not made direct comparisons across business disciplines. In the accounting education literature, Simons, Higgins, and Lowe (1995) observed that accounting majors have higher writing and oral communication apprehension than other business majors but they did not link apprehension to level of skill required in each major. In another accounting education study, Cohen and Hanno (1993) found that students perceive the accounting major as requiring more math skills than other business majors, but they did not measure math apprehension. The inference from the communication and accounting studies is that the accounting major has typically attracted
students with relatively higher levels of math ability and lower levels of math apprehension. Conversely, it can also be inferred that the accounting major has typically attracted students with relatively high communication apprehension and lower levels of communication skills. However, none of the previous research has specifically linked student perceptions of skills required to succeed in a business major with apprehension. The current study bridges this gap by investigating both variables across five business majors.
Understanding how students view accounting and other business disciplines is the first step to attracting better students into the accounting major. Academics and practicing accountants would agree that quantitative skills and oral and written communication skills are all important for success in the accounting profession (AICPA, 1988; Arthur Andersen & Co. et al., 1989; AECC, 1990). Anecdotal evidence suggests that accounting professionals perceive accounting graduates to be reasonably good technical accountants whose communication skills fall short of expectations. However, professionals typically describe what would qualify as better communication skills only in general terms.
Helping accounting graduates attain the level of communication skills necessary to be successful accounting professionals can result from a revised accounting curriculum and/or from attracting students with different abilities into the major. With regard to the curriculum, accounting educators at many schools have redesigned or enhanced the accounting curriculum to incorporate the development of the accounting major’s communication skills. However, these changes may not result in accounting graduates who are better prepared to meet professional expectations if there is not a change in the level of writing and oral communication apprehension of students attracted to the major.
While changing the curriculum is under the control of the accounting faculty, the choice of academic major is under the student’s control. Given that students will avoid majors where they have a high apprehension level for the skills perceived to be required, there must be a change in student perceptions if the accounting major is going to attract students with different (i.e., greater communication) abilities. That is, the students attracted to the major may not change if students continue to perceive that strong communication skills are not needed to be successful in the accounting major. The current study investigates perceptions of mathematics, writing, and oral communication skills required in different business majors (i.e., accounting, finance, information systems, management, and marketing) and the levels of math and communication apprehension (oral and writing)
exhibited by the respondents. The analysis presents information as to how accounting students view the use of certain skills in the accounting major as compared to how students in other business majors view the use of those skills in the accounting major. The analysis also investigates how accounting and other business majors view the use of those skills in the other business majors. Finally, the paper investigates how apprehension toward the different skills affects the choice of major. The remainder of the paper is organized as follows. The next section presents the theoretical background and the research questions. Subsequent sections present the research methods and the results. The final section presents the discussion, conclusions, and limitations.
THEORY AND RESEARCH QUESTIONS
A student’s perception of the skills required to succeed in the chosen career is one issue that may influence the selection of an academic major. Cohen and Hanno (1993) found that a student’s perception as to the requirement for quantitative skills is significant in deciding whether to major in accounting. Other researchers (e.g., Felton, Dimnick, & Northey, 1995) have observed that a student’s perception of other required skills can also influence a student’s choice of major. Neither Cohen and Hanno (1993) nor Felton et al. (1995) analyzed student perceptions as to the requirement of communication skills (writing or oral) in deciding whether to major in accounting. The current study extends the work of Cohen and Hanno by investigating the perception of prospective accounting majors regarding the relative importance of math, writing, and oral communication skills to be a successful accounting major. Given that writing and oral communication skills have not been investigated, the following research question is proposed.
RQ1. Do accounting majors perceive an equal level of math, writing, and oral communication skill is required in the accounting major?
For students to make a sound choice in academic major, an accurate perception of the required skills of each major is important. Cohen and Hanno (1993) observed that students use the perceived quantitative nature of accounting as a factor in deciding to choose a major other than accounting. A problem may exist if students have an outdated perspective of the accounting major due to failure to be informed of the significant changes
that have occurred in the profession and the curriculum. As a result, students who choose to major in accounting may have different beliefs about the skills required to be an accounting major than do students who choose a business major other than accounting. If students are choosing not to major in accounting because of a different perspective of the skills necessary to be an accounting major, then accounting programs may be failing to attract qualified applicants because of misinformation. The current study compares the perception of students choosing to major in accounting with the perception of students choosing to major in other business areas regarding the math, writing, and oral communication skills required by the accounting major, hence research question two:
RQ2. Do students who choose a business major other than accounting perceive the same level of math, writing, and oral communication skill requirements in the accounting major as do students choosing accounting as a major?
Another equally important issue is for students to have a realistic perception of the skills needed to succeed in alternative business majors. Accounting students should choose a major after considering the alternatives based on accurate information. Another aspect of the current study is an investigation into the perception accounting majors have about other business majors with regard to math, writing, and oral communication skills. These perceptions can then be compared to the amount of skills perceived to be required by students who have chosen the alternative business major. Hence, the third and fourth research questions:
RQ3. Do students who choose the accounting major perceive the same level of math, writing, and oral communication skill requirements in alternative business majors as are perceived to be required in the accounting major?
RQ4.
Do students who choose the accounting major perceive the same level of math, writing, and oral communication skill requirements in alternative business majors as do students who have chosen those majors?
Table 1 contains a matrix that depicts the concepts and groups being compared in the four research questions.

Table 1. Research Questions Investigated.
Math, writing, and oral communication skills needed to be a successful accounting major
    Accounting major perceptions: RQ1
    Other business major perceptions: versus accounting major perceptions (RQ2)
Math, writing, and oral communication skills needed to be successful in other business major
    Accounting major perceptions: versus accounting major (RQ3)
    Other business major perceptions: versus accounting major perceptions (RQ4)

Resnick, Viehe, and Segal (1982) investigated the relationship between math anxiety and avoidance of math courses and vocational choices involving math skills. The authors concluded that math anxiety is a critical filter in both educational and occupational choices. Bennett and Rhodes (1988) discovered that individuals with high levels of writing apprehension were more likely to occupy jobs with minimal writing requirements, while Daly and Shapiro (1978) discovered that students who tested high in writing apprehension were more likely to choose majors that were perceived as requiring less writing. An early investigation into oral communication apprehension affecting vocational choice indicated that high communication apprehensives perceived low-communication occupations as more desirable and that low apprehensives perceived high-communication occupations as more desirable (Daly & McCroskey, 1975).
A body of literature in accounting (Stanga & Ladd, 1990; Simons et al., 1995; Ruchala & Hill, 1994) has investigated the existence of communication apprehension among various groups of business majors. The findings indicate that accounting majors have higher apprehension about oral communications and writing than other business students. These prior studies support the position that among the criteria students will use when selecting an academic major is their level of apprehension toward the skills perceived to be required to succeed in that major. Based on the results reported by Cohen and Hanno (1993) indicating that quantitative skills are significant in deciding whether to be an accounting major, the accounting major respondents’ self-reported level of math apprehension is expected to be lower than their self-reported levels of writing and oral communication apprehension. As a result, the following research question is also investigated in the current study:
RQ5. Do accounting majors exhibit a lower level of math apprehension and a higher level of writing and oral communication apprehension when compared to other business majors?
METHODOLOGY
Participants
Data for the current study were gathered from approximately 800 sophomore business majors enrolled in principles of accounting classes at schools located in the Northeast, Middle Atlantic, and Southwest regions of the United States. The survey was administered early in the semester, so the course should have little impact on the desirability of accounting as a possible major. Table 2 provides the average GPA and combined SAT score for each group of students categorized by self-reported major. Although an inverse relationship exists between GPA and SAT score, we believe this is not related to courses taken because all of the respondents are sophomores; therefore, the classes taken by each major are basically the same at this point. Although no tests of significance were conducted, the GPAs reported here appear to be relatively homogeneous between Accounting and Finance, and between Management and Marketing, as would be expected. Information Systems majors reported the highest GPA but lowest SAT of all groups.

Table 2. Average GPA and SAT by Responding Major.
          Accounting   Finance   Information Systems   Management   Marketing
n         284          135       58                    141          172
GPA(a)    2.52         2.64      3.14                  3.03         2.94
SAT       1,039        1,069     971                   989          1,020
(a) 4-point scale.

Data
Respondents answered a variety of questions including their current major, their perceptions of the level of math, writing, and oral communication skills required by each business major, and a series of questions that measure their apprehension level toward these skills. Instructions for the perception-of-skills-needed section of the survey included a reminder that not all business majors require the same level of certain skills. This statement was designed to provide an anchor for respondents as they indicated their perception of the level of mathematics, writing, and oral communication each of the five business majors would require. Apprehension measures were obtained using publicly available scales used in previous studies. Responses were on a 7-point Likert scale (1 = almost none, 7 = a great deal). Cronbach’s (1951) alphas for the math, writing, and oral communication apprehension measures were .87, .96, and .96, respectively. Responses are reported by major: accounting, finance, information systems, management, and marketing.
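For readers unfamiliar with the reliability statistic reported for the apprehension scales, the following is a minimal sketch of how Cronbach's alpha is conventionally computed: alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the item scores below are hypothetical illustrations, not the study's instrument or data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    k = len(responses[0])  # number of items on the scale
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(person) for person in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 7-point responses from five respondents to a four-item scale.
hypothetical_scores = [
    [2, 3, 2, 3],
    [5, 5, 6, 5],
    [4, 4, 4, 5],
    [6, 7, 6, 6],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(hypothetical_scores)
print(f"alpha = {alpha:.2f}")
```

Values close to 1 indicate that the items move together across respondents, which is the sense in which the reported coefficients describe internally consistent scales.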
RESULTS
Table 3 provides a matrix of responses that indicate each major group’s perception with regard to the level of the three skills (math, writing, and oral communication) required of students majoring in accounting. Data to assess Research Question 1 (Do accounting majors perceive an equal level of math, writing, and oral communication skill is required in the accounting major?) are found in the Accounting column of Table 3. This column indicates that respondents who chose to be an accounting major perceive that success requires a greater level of math skill (5.91) than oral communication skill (5.39) and more oral communication skill than writing skill (4.59). These mean levels differ significantly from one another (p < .001). These results support the findings of Cohen and Hanno (1993), who found that accounting majors are more likely to desire a career in a field that works with numbers.

Table 3. Perceived Level of Skill Required for Accounting Major by Respondent Major.
          Accounting   Finance   Information Systems   Management   Marketing
n         284          135       58                    141          172
Math      5.91         6.20      6.07                  6.36*        6.07
Writing   4.59         4.10**    4.37                  4.17**       4.11**
Oral      5.39         4.72**    5.03                  4.62**       4.68**
Scale: 1 = almost none, 7 = a great deal.
*Significant difference from Accounting respondents, p < .05.
**Significant difference from Accounting respondents, p < .001.

Table 3 also presents the perceptions of math, writing, and oral communication skills needed to be an accounting major from the perspective of students who chose another business major. In general, students in other business majors perceived that the accounting major requires a greater level of math skill and less writing and oral communication skills than was perceived by students who are accounting majors. With regard to the level of math skill, analysis of variance results indicate that only accounting majors (5.91) and management majors (6.36) differ significantly (p < .05). Management majors perceive that there is more math required in accounting than is perceived by accounting majors. With regard to writing and oral communication skills, respondents majoring in finance, management, and marketing each perceive that the accounting major requires significantly less of these skills than was perceived by the accounting majors (p < .001). Only those students who chose to major in information systems did not perceive that the accounting major required less writing and oral communication skills. Thus, the answer to Research Question 2 (Do students who choose a business major other than accounting perceive the same level of math, writing, and oral communication skill requirements in the accounting major as do students choosing accounting as a major?) is mixed. While students in most other business majors perceive the same level of math skill needed to be an accounting major as do accounting majors, the perception of the level of oral communication and writing skills needed to be an accounting major is significantly lower by most other majors when compared to the perception of accounting majors. These findings generally indicate that students choosing a business major other than accounting do not have the same perception of the skills required to be an accounting major as students who choose to be an accounting major.
Table 4 presents a matrix of responses revealing how accounting majors perceive skill requirements of alternative business majors with regard to math, writing, and oral communication.
This table provides insight into how accounting majors view the requirements to be a major in other disciplines in contrast to how they view the requirements to be an accounting major. The accounting major respondents perceive that the finance major requires significantly more (p < .001) math skills (6.25) than accounting (5.91), but that the accounting major requires significantly more (p < .001) math skills than information systems (5.02), management (4.66), or marketing (4.70). Conversely, accounting major respondents perceive that each of the other majors requires significantly more (p < .001) writing and oral communication skills than accounting, with two exceptions: oral communication for the finance major (5.39 versus 5.41) and for the information systems major (5.39 versus 5.23). Thus, the answer to Research Question 3 (Do students who choose the accounting major perceive the same level of math, writing, and oral communication skill requirements in alternative business
WILDA F. MEIXNER ET AL.
Table 4. Perceived Level of Skill Required for Business Disciplines by Accounting Majors.
Major                  Math      Writing   Oral communication
Accounting             5.91      4.59      5.39
Finance                6.25***   4.86***   5.41
Information systems    5.02***   5.01***   5.23
Management             4.66***   5.78***   6.56***
Marketing              4.70***   5.92***   6.46***
Scale: 1 = almost none, 7 = a great deal; n = 284. ***Significant difference from Accounting respondents, p < .001.
majors as are perceived to be required in the accounting major?) is that students majoring in accounting do not perceive the same level of skill requirements in other majors when compared to the accounting major. Table 5 compares accounting majors' perceptions of the math, writing, and oral communication skill requirements of other majors with the perceptions of respondents in those majors. Table 3 illustrated that students in other majors do not have the same view of skill requirements to be an accounting major as do accounting majors. Table 5 presents information from the other perspective: do accounting majors have the same view of skill requirements to be in another business discipline as do students who choose a major in that discipline? Accounting student perceptions of the math skill requirement of the finance and marketing majors (6.25 and 4.70, respectively) are not significantly different from the perceived level of skill requirement by students in those majors (6.34 and 4.73, respectively). On the other hand, accounting
Table 5. Difference between Perceived Level of Skill Required for Business Disciplines by Students in that Major as Compared to Accounting Major Skill-Level Perceptions of that Major.
Major and skill                  By Accounting Major (N = 284)   By Respondent in that Major
Finance (n = 135)
  Math                           6.25                            6.34
  Writing                        4.86                            4.91
  Oral communication             5.41                            5.43
Information systems (n = 58)
  Math                           5.02                            5.36*
  Writing                        5.01                            4.55**
  Oral communication             5.23                            4.97
Management (n = 141)
  Math                           4.66                            5.01**
  Writing                        5.78                            5.72
  Oral communication             6.56                            6.43
Marketing (n = 172)
  Math                           4.70                            4.73
  Writing                        5.92                            5.80
  Oral communication             6.46                            6.43
Scale: 1 = almost none, 7 = a great deal. *Significant difference from Accounting respondents, p < .05. **Significant difference from Accounting respondents, p < .01.
student perceptions of the math requirement for information systems and management majors are lower than the math skill requirement perceived by students in those majors (5.02 versus 5.36, p < .05; and 4.66 versus 5.01, p < .01, respectively). The accounting major student perceptions of writing and oral communication skills were not significantly different from the perceptions of students in those majors, with one exception: information systems. Information systems students perceive that they have a lower writing requirement (4.55) than is perceived by accounting majors (5.01), p < .05. Thus, the answer to Research Question 4 (Do students who choose the accounting major perceive the same level of math, writing, and oral communication skill requirements in alternative business majors as do students who have chosen those majors?) is generally positive. Students who choose to major in accounting perceive (in 9 of 12 comparisons) essentially the same level of skill requirements in alternative business majors as do students in those majors. Table 6 provides the measures of apprehension toward each of the three skills as exhibited by each major. Accounting majors have a significantly
Table 6. Mean Apprehension by Major.
                               Math                 Writing              Oral
Major                          Mean     Std. dev.   Mean     Std. dev.   Mean     Std. dev.
Accounting (n = 284)           2.67     .26         2.78     .80         3.03     .77
Finance (n = 135)              2.67     .24         2.70     .84         2.83*    .80
Information systems (n = 58)   2.76**   .21         2.95     .86         3.15     .81
Management (n = 141)           2.77**   .27         2.64     .84         2.97     .81
Marketing (n = 172)            2.76**   .23         2.49**   .77         2.69**   .73
Scale: 1 = low apprehension, 5 = high apprehension. *Significant difference from Accounting respondents, p < .05. **Significant difference from Accounting respondents, p < .01.
lower (p < .01) level of math apprehension (2.67) relative to the information systems (2.76), management (2.77), and marketing (2.76) majors. Accounting majors have the second-highest level of writing (2.78) and oral communication (3.03) apprehension, behind information systems majors (2.95 and 3.15, respectively); however, the level of writing and oral communication apprehension for accounting majors is not significantly different from the level for information systems majors. The accounting majors’ writing apprehension (2.78) is significantly higher than the writing apprehension of marketing majors (2.49, p < .01). In addition, the accounting majors’ oral communication apprehension (3.03) is significantly higher than the oral communication apprehension of finance majors (2.83, p < .05) and marketing majors (2.69, p < .01). Thus, the answer to Research Question 5 (Do accounting majors exhibit a lower level of math apprehension and a higher level of writing and oral communication apprehension when compared to other business majors?) is mixed. In 6 of the 12 instances, the level of apprehension exhibited by accounting major students is in the expected direction (lower math apprehension and higher writing and oral communication apprehension than students in other majors).
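The tally for Research Question 5 can be retraced from the summary statistics in Table 6. The sketch below is illustrative only. It reads the "6 of the 12 instances" as counting comparisons that are both statistically significant and in the expected direction, which is an interpretation rather than something the chapter states. The chapter also does not say which pairwise test produced the significance flags, so a Welch t statistic computed from the reported means, standard deviations, and sample sizes is assumed here, with a normal approximation for the p value. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# (mean, standard deviation, n) transcribed from Table 6.
TABLE6 = {
    "Accounting":          {"math": (2.67, 0.26, 284), "writing": (2.78, 0.80, 284), "oral": (3.03, 0.77, 284)},
    "Finance":             {"math": (2.67, 0.24, 135), "writing": (2.70, 0.84, 135), "oral": (2.83, 0.80, 135)},
    "Information systems": {"math": (2.76, 0.21, 58),  "writing": (2.95, 0.86, 58),  "oral": (3.15, 0.81, 58)},
    "Management":          {"math": (2.77, 0.27, 141), "writing": (2.64, 0.84, 141), "oral": (2.97, 0.81, 141)},
    "Marketing":           {"math": (2.76, 0.23, 172), "writing": (2.49, 0.77, 172), "oral": (2.69, 0.73, 172)},
}


def welch_t(accounting, other):
    """Welch t statistic (other minus accounting) from (mean, sd, n) tuples.

    ASSUMPTION: the chapter does not name its pairwise test; Welch's
    unequal-variance statistic is used here purely as an illustration.
    """
    m1, s1, n1 = accounting
    m2, s2, n2 = other
    return (m2 - m1) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def two_sided_p(t):
    """Two-sided p value using a normal approximation to the t distribution.

    ASSUMPTION: with samples of 58 to 284 respondents the approximation is
    close, but it is not the exact t-distribution p value.
    """
    return 2 * (1 - NormalDist().cdf(abs(t)))


def expected_direction_hits(alpha=0.05):
    """List (major, skill) pairs that are significant and in the expected direction.

    Expected direction: accounting majors lower on math apprehension and
    higher on writing and oral communication apprehension.
    """
    accounting = TABLE6["Accounting"]
    hits = []
    for major, skills in TABLE6.items():
        if major == "Accounting":
            continue
        for skill, stats in skills.items():
            t = welch_t(accounting[skill], stats)
            significant = two_sided_p(t) < alpha
            expected = t > 0 if skill == "math" else t < 0
            if significant and expected:
                hits.append((major, skill))
    return hits


hits = expected_direction_hits()
print(len(hits), "of 12 comparisons:", hits)
```

Under these assumptions the flagged comparisons are math for information systems, management, and marketing; writing for marketing; and oral communication for finance and marketing. These are the same six cells that carry asterisks in Table 6.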
CONCLUSION
The results of the current study indicate that there may be numerous perceptual differences among business students with regard to the skills required to be successful in various business majors. The combination of the
different perceptions regarding skills required for success with the different levels of apprehension pertaining to these skills could partially explain why accounting is not attracting students with a different mix of skills. This study provides evidence that accounting majors’ perceptions of skills required in other business majors differ substantially from their perceptions of skills necessary to be successful in accounting. Accounting majors were found to perceive that the accounting program requires greater math skills than other majors (except finance) and fewer communication skills (other than oral communication in finance and information systems). These perceptions of accounting majors must be interpreted carefully in light of the relatively low level of math apprehension and high communication apprehension exhibited by accounting majors. If these findings were supported by the accounting students having significantly higher communication apprehension, the results would be consistent with previous research observing that students avoid majors in areas where they have a high level of apprehension (Daly & McCroskey, 1975; Daly & Shapiro, 1978; Resnick et al., 1982; Bennett & Rhodes, 1988). However, the accounting majors are observed to have higher communication apprehension only than marketing majors (writing and oral) and finance majors (oral). Otherwise, the level of communication apprehension of the accounting majors was not significantly different from that of the other business majors. As a result, the relationship between apprehension and skills seems to explain only why the accounting majors did not choose marketing as a major. Although this was not tested, students in other business majors may perceive the skill levels necessary to succeed in accounting to be substantially different from the skill levels necessary to be successful in their own major.
For example, marketing majors rated the skill requirements for the accounting major at 6.07, 4.11, and 4.68 for math, writing, and oral communication, respectively, while they rated the marketing major skill requirements at 4.73, 5.80, and 6.43, respectively, for those same factors. Respondents planning to major in a business field other than accounting generally perceive that their chosen major requires more communication skills (writing and oral) than are required in accounting. The exception is the information systems major, where a greater level of writing skill but a lower level of oral communication skill is perceived to be required. This observation is generally consistent with the lower levels of writing and oral communication apprehension observed for the finance, marketing, and management majors in comparison to the accounting majors’ level of apprehension. Information systems majors had a higher level of writing and oral communication apprehension than accounting majors. The
perceived math skill requirements for respondents in the information systems, management, and marketing majors are lower than for accounting, while finance majors’ perceived math skill requirement for their own major is higher than their perception of the math skill requirement in accounting. These observations are also consistent with the levels of math apprehension observed: for example, math apprehension among finance majors equals that of accounting majors, while it is higher among information systems, management, and marketing majors. The premise motivating this paper is that a student’s selection of an academic major is based, in part, on the skills perceived to be required to succeed in each major and the student’s apprehension toward that skill area. Students who choose each of the academic majors seem to agree that success in the accounting major requires more math skill than writing and oral communication skill. Students who choose the finance, management, and marketing majors disagree with accounting majors when comparing the level of writing and oral communication skills required in accounting. On the other hand, accounting major respondents, in general, perceived the same level of skill requirement in other business majors as did students in those majors. If accounting faculty desire to attract students to the accounting major who have greater communication skills, the perception of students regarding the skills required to succeed in accounting must change. We believe that such a change can only occur slowly. There are several potential contributors to the continuation of the perceptions students have about the accounting major. First, high school accounting courses and career guidance counselors are possibly the initial source of the observed perceptions. A student’s initial exposure to accounting can cause some to self-select out of accounting before they have an opportunity to learn that accounting is much more than bookkeeping.
Cooperation with the accounting profession may be needed if change is to occur in this area. Endeavors such as the AICPA financial literacy program may be useful in helping potential accounting majors understand that the accounting profession is about more than quantitative analysis and journal entries. Accountants emphasizing the use of accounting information in everyday situations may show a relevance that some students do not understand. A second source of the continuation of current perceptions is the manner in which the introductory accounting courses are taught. While the accounting faculty at many schools have changed the introductory-level courses to a user orientation, other schools still teach these courses with a preparer orientation focusing on the memorization of computations and journal entries. Teaching these courses in a manner that emphasizes computations and memorization over thinking and communication skills
may reinforce the student’s perception that communication skills have limited use in the accounting major and accounting profession. In reality, accounting faculty may be part of the profession’s problem if the curriculum changes do not reflect the skills needed to succeed in practice. A third possible reason the observed perceptions may be perpetuated is the opinions of faculty in other business disciplines. Business faculty outside of accounting may hold obsolete views regarding the skills necessary to be a successful accounting professional. Students may receive information and form opinions regarding accounting and other business majors from these faculty either in class or from faculty advisors. Regardless of the source of information, inaccurate perceptions by students may result in failure to attract students to the accounting major who have highly desirable skills. The results and conclusions of this study should be interpreted in light of several research limitations. First, the data were gathered by surveying sophomore students in an accounting course. While every effort was made to gather the data in an unbiased manner, the fact that the data were gathered in an accounting course could have produced a halo effect. Students who had not reached a decision on an academic major may have misclassified themselves when choosing an academic major in the survey. In addition, some students may have been less than fully engaged when completing the survey. However, there is no reason to believe that these problems would be greater for any particular major, so they should not have a direct impact on the results. Second, there are two possible data issues to consider. The data are based on student self-reported measures. In addition, the SAT variable gathered was an overall score. The quantitative and verbal variables were not gathered separately.
The availability of the individual scores would have provided a direct measure of student ability and made additional analysis possible. In conclusion, anecdotal evidence suggests that accounting professionals want entry-level staff accountants to have greater communication skills, along with strong analytical skills. As a result, accounting programs should strive to graduate students who have strong skills in both areas. To accomplish this objective, the accounting profession and the accounting programs have to inform prospective students that accounting requires a wide range of skills. Future research is therefore needed to investigate ways in which student opinion can be altered so that it better reflects the skill set needed to be a successful accounting professional. Given that some of the preconceived notions may be established before students enter the university, future research may require the coordination of accounting faculty, the accounting profession, and others who influence how prospective
students perceive the accounting major, such as guidance counselors, high school accounting instructors, and other important reference groups.
REFERENCES
Accounting Education Change Commission. (1990). Objectives of education for accountants.
AICPA. (1988). Education requirements for entry into the accounting profession. A Statement of AICPA Policies (2nd ed., revised), AICPA, New York, USA.
Arthur Andersen & Co., Arthur Young, Coopers & Lybrand, Deloitte Haskins & Sells, Ernst & Whinney, Peat Marwick Main & Co., Price Waterhouse, & Touche Ross. (1989). Perspectives on education: Capabilities for success in the accounting profession.
Bennett, K., & Rhodes, S. C. (1988). Writing apprehension and writing intensity in business and industry. Journal of Business Communication, 25(1), 25–31.
Cohen, J., & Hanno, D. M. (1993). An analysis of underlying constructs affecting the choice of accounting as a major. Issues in Accounting Education, 8(2), 219–238.
Cronbach, L. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Daly, J. A., & McCroskey, J. C. (1975). Occupational desirability and choice as a function of communication apprehension. Journal of Counseling Psychology, 22, 309–313.
Daly, J. A., & Shapiro, W. (1978). Academic decision as a function of writing apprehension. Research in the Teaching of English, 12, 119–126.
Felton, S., Dimnick, T., & Northey, M. (1995). A theory of reasoned action model of the chartered accountant career choice. Journal of Accounting Education, 13(1), 1–19.
Gul, F., Andrews, B., Leong, S., & Ismail, Z. (1989). Factors influencing choice of discipline of study–Accountancy, engineering, law and medicine. Accounting and Finance, 4, 93–101.
Resnick, H., Viehe, J., & Segal, S. (1982). Is math anxiety a local phenomenon? A study of prevalence and dimensionality. Journal of Counseling Psychology, 29, 39–47.
Ruchala, L., & Hill, J. (1994). Reducing accounting students’ oral communication apprehension: Empirical evidence. Journal of Accounting Education, 12(4), 283–303.
Simons, K. A., Higgins, M., & Lowe, D. (1995). A profile of communication apprehension in accounting majors: Implications for teaching and curriculum revision. Journal of Accounting Education, 13, 159–176.
Stanga, K. G., & Ladd, R. T. (1990). Oral communication apprehension in beginning accounting majors: An exploratory study. Issues in Accounting Education, 5(Fall), 38–50.