CONTENTS

LIST OF CONTRIBUTORS

REVIEWER ACKNOWLEDGMENTS

EDITORIAL POLICY AND SUBMISSION GUIDELINES

TAX COMPLIANCE INTENTIONS OF LOW-INCOME INDIVIDUAL TAXPAYERS
Henry Efebera, David C. Hayes, James E. Hunton and Cherie O’Neil

DETERMINANTS OF TAX PROFESSIONALS’ ADVICE AGGRESSIVENESS AND FEES
Donna D. Bobek and Richard C. Hatfield

BEHAVIORAL IMPLICATIONS OF ALTERNATIVE GOING CONCERN REPORTING FORMATS
Chantal Viger, Asokan Anandarajan, Anthony P. Curatola and Walid Ben-Amar

MANAGEMENT FRAUD RISK FACTORS: AN EXAMINATION OF THE SELF-INSIGHT OF AND CONSENSUS AMONG FORENSIC EXPERTS
Sally A. Webber, Barbara Apostolou and John M. Hassell

BUDGETARY SLACK CREATION AND TASK PERFORMANCE: COMPARING INDIVIDUALS TO COLLECTIVE UNITS
James M. Kohlmeyer III and James E. Hunton

BUDGET TEAM GOALS AND PERFORMANCE ANTECEDENT AND MEDIATING EFFECTS
Peter Chalos, Margaret Poon, Dean Tjosvold and W. J. Dunn III

PERFORMANCE EVALUATIONS, WITH OR WITHOUT DATA FROM A FORMAL ACCOUNTING REPORTING SYSTEM
Yin Xu and Brad Tuttle

UNRAVELING THE EXPECTATIONS GAP: AN ASSURANCE GAPS MODEL AND ILLUSTRATIVE APPLICATION
Kimberly Gladden Burke, Stacy E. Kovar and Penelope J. Prenshaw
LIST OF CONTRIBUTORS

Asokan Anandarajan
School of Management, New Jersey Institute of Technology, Rutgers University, USA
Barbara Apostolou
Ourso College of Business, Louisiana State University, USA
Walid Ben-Amar
School of Management, University of Ottawa, Canada
Donna D. Bobek
School of Accounting, University of Central Florida, USA
Kimberly Gladden Burke
Else School of Management, Millsaps College, USA
Peter Chalos
College of Business, University of Illinois at Chicago, USA
Anthony P. Curatola
Department of Accounting, Drexel University, USA
W. J. Dunn III (Retired)
University of Illinois at Chicago, USA
Henry Efebera (Deceased)
University of Akron, USA
John M. Hassell
Indiana University Kelley School of Business Indianapolis, Indiana University-Purdue University Indianapolis, USA
Richard C. Hatfield
College of Business, University of Texas at San Antonio, USA
David C. Hayes
Ourso College of Business, Louisiana State University, USA
James E. Hunton
Bentley College, USA and Maastricht University, NL
James M. Kohlmeyer III
School of Business, East Carolina University, USA
Stacy E. Kovar
College of Business Administration, Kansas State University, USA
Cherie O’Neil
College of Business, Colorado State University, USA
Margaret Poon
Department of Accounting, City University of Hong Kong, China
Penelope J. Prenshaw
Else School of Management, Millsaps College, USA
Dean Tjosvold
School of Business, Lingnan University, China and Simon Fraser University, Canada
Brad Tuttle
Moore School of Business, University of South Carolina, USA
Chantal Viger
Accounting Department, University of Quebec at Montreal, Canada
Sally A. Webber
Department of Accountancy, Northern Illinois University, USA
Yin Xu
College of Business & Public Administration, Old Dominion University, USA
REVIEWER ACKNOWLEDGMENTS

The Editor and Associate Editors at AABR would like to thank the many excellent reviewers who have volunteered their time and expertise to make this an outstanding publication. Publishing quality papers in a timely manner would not be possible without their efforts.

Elizabeth Almer Portland State University, USA
Christie L. Comunale Long Island University-C.W. Post Campus, USA
John Anderson San Diego State University, USA
Charles Cullinan Bryant College, USA
Philip Beaulieu University of Calgary, Canada
William N. Dilla Iowa State University, USA
Jean Bedard Northeastern University, USA
Craig Emby Simon Fraser University, Canada
James Bierstaker University of Massachusetts Boston, USA
Glen Gray California State University Northridge, USA
Dennis M. Bline Bryant College, USA
Clark Hampton University of Connecticut, USA
Rich Brody University of New Haven, USA
Rick Hatfield University of Texas San Antonio, USA
Robert H. Chenhall Monash University, Australia
Mary Callahan Hill Kennesaw State University, USA
Vincent Chong University of Western Australia, Australia
Karen L. Hooks Florida Atlantic University, USA
Freddie Choo San Francisco State University, USA
James E. Hunton Bentley College, USA and Maastricht University, NL
Mike Kirschenheiter Columbia University, USA
Robert J. Parker University of New Orleans, USA
James M. Kohlmeyer III East Carolina University, USA
Michael Roberts University of Alabama, USA
Stacy Kovar Kansas State University, USA
Jacob Rose Montana State University, USA
Theresa Libby Wilfrid Laurier University, Canada
Andrew J. Rosman University of Connecticut, USA
Daryl Lindsay University of Saskatchewan, Canada
Georgia Smedley University of Nevada – Las Vegas, USA
Elaine Mauldin University of Missouri, USA
James Maroney Northeastern University, USA
John Sweeney Washington State University, USA
Venky Nagar University of Michigan, USA
Stan Veliotis University of Connecticut, USA
Andreas Nikolaou Bowling Green State University, USA
Sally A. Webber Northern Illinois University, USA
Hossein Nouri College of New Jersey, USA
Kristin Wentzel La Salle University, USA
Ed O’Donnell Arizona State University, USA
John Wermert Drake University, USA
Laurie Pant Suffolk University, USA
Patrick Wheeler University of Missouri, USA
EDITORIAL POLICY AND SUBMISSION GUIDELINES

Advances in Accounting Behavioral Research (AABR) publishes articles encompassing all areas of accounting that incorporate theory from and contribute new knowledge and understanding to the fields of applied psychology, sociology, management science, and economics. The journal is primarily devoted to original empirical investigations; however, literature review papers, theoretical analyses, and methodological contributions are welcome. AABR is receptive to replication studies, provided they investigate important issues and are concisely written. The journal especially welcomes manuscripts that integrate accounting issues with organizational behavior, human judgment/decision making, and cognitive psychology.

Manuscripts will be blind-reviewed by two reviewers and an associate editor. The recommendations of the reviewers and associate editor will be used to determine whether to accept the paper as is, accept the paper with minor revisions, reject the paper, or invite the authors to revise and resubmit the paper.
MANUSCRIPT SUBMISSION

Manuscripts should be forwarded to the editor, Vicky Arnold, at Vicky.[email protected] via e-mail. All text, tables, and figures should be incorporated into a Word document prior to submission. The manuscript should also include a title page containing the names and addresses of all authors and a concise abstract. Also include a separate Word document with any experimental materials or survey instruments. If you are unable to submit electronically, please forward the manuscript along with the experimental materials to the following address:

Vicky Arnold, Editor
Advances in Accounting Behavioral Research
Department of Accounting U41A
School of Business
University of Connecticut
Storrs, CT 06269-2041, USA
References should follow the APA (American Psychological Association) standard. References should be indicated by giving (in parentheses) the author’s name followed by the date of the journal or book; or with the date in parentheses, as in “suggested by Earley (2000).” In the text, use the form Rosman et al. (1995) where there are more than two authors, but list all authors in the references. Quotations of more than one line of text from cited works should be indented and citation should include the page number of the quotation; e.g. (Dunbar, 2001, p. 56). Citations for all articles referenced in the text of the manuscript should be shown in alphabetical order in the reference list at the end of the manuscript. Only articles referenced in the text should be included in the reference list. Format for references is as follows:
For Journals
Dunn, C. L., & Gerard, G. J. (2001). Auditor efficiency and effectiveness with diagrammatic and linguistic conceptual model representations. International Journal of Accounting Information Systems, 2(3), 1–40.

For Books
Ashton, R. H., & Ashton, A. H. (1995). Judgment and decision-making research in accounting and auditing. New York, NY: Cambridge University Press.

For a Thesis
Smedley, G. A. (2001). The effects of optimization on cognitive skill acquisition from intelligent decision aids. Unpublished doctoral dissertation, University.

For a Working Paper
Thorne, L., Massey, D. W., & Magnan, M. (2000). Insights into selection-socialization in the audit profession: An examination of the moral reasoning of public accountants in the United States and Canada. Working Paper, York University, North York, Ontario.

For Papers From Conference Proceedings, Chapters From Book, etc.
Messier, W. F. (1995). Research in and development of audit decision aids. In: R. H. Ashton & A. H. Ashton (Eds), Judgment and Decision Making in Accounting and Auditing (pp. 207–230). New York: Cambridge University Press.
TAX COMPLIANCE INTENTIONS OF LOW-INCOME INDIVIDUAL TAXPAYERS*

Henry Efebera, David C. Hayes, James E. Hunton and Cherie O’Neil

ABSTRACT

Prior tax compliance research has largely ignored low-income individual taxpayers, as they have historically been viewed as having an immaterial impact on Federal tax revenues. However, the earned income tax credit (EITC) program has altered the Federal tax revenue landscape in this regard. The Internal Revenue Service (IRS) investigated the magnitude of EITC overpayments for tax year 1999 and concluded that between 27 and 31% of EITC filings were overstated, resulting in overpayments of between $8.5 and $9.9 billion (IRS, 2002). These excessive payments represented about 0.5% of total Federal revenues and 2.8% of the total tax gap. Thus, to the extent that low-income individual taxpayers intentionally under-report their incomes in order to receive higher EITCs, the Federal budget is noticeably affected.

* This paper is based on Henry Efebera’s dissertation, which he completed at the University of South Florida. Upon graduation, Henry became an assistant professor at the University of Akron. Henry unexpectedly died on March 25th, 2002, leaving behind his wife, Yvonne, and three children, Omotade, Yvette, and Ebiyemi. Henry was a kind soul with a heart of gold. We miss him dearly and publish this article in his loving memory.
Advances in Accounting Behavioral Research, Volume 7, 1–25
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07001-2
This study extends and complements extant tax research by examining the compliance intentions of low-income individual taxpayers. Relying on the theory of planned behavior, we examine the extent to which perceived tax equity (vertical, horizontal and exchange), normative expectations, and legal sanctions affect tax compliance intentions. Consistent with the hypotheses, the results indicate a significant positive relationship between compliance intentions and: (1) equity perceptions of the tax system; (2) normative expectations of compliance; and (3) penalty magnitude. Additionally, the findings suggest two-way interactions between penalty magnitude and exchange equity, and penalty magnitude and normative expectations. Research results reported herein hold important policy implications related to the Federal government’s efforts to reduce tax cheating and increase compliance among low-income individual taxpayers.
INTRODUCTION

Deliberate tax non-compliance is believed to be relatively widespread, and it represents a serious problem for the U.S. fiscal system (Smith & Kinsey, 1987; Worsham, 1996). Although prior research has provided useful insight into factors associated with tax compliance in general, it has largely ignored low-income individual taxpayers and instead focused almost exclusively on middle- and upper-income individual taxpayers and business taxpayers. Low-income individual taxpayers have been understudied in the tax literature, possibly because they typically have little or no incentive to cheat on taxes, rarely encounter tax avoidance opportunities, and have a relatively small effect on tax revenue. However, with the significant expansion of the earned income tax credit (EITC),1 the incentive, opportunity and impact of non-compliance have increased significantly for this group of taxpayers in the last decade. For example, the General Accounting Office (1997) reported that the cost of the EITC program increased by 40% between 1993 and 1994 to $21 billion and attributed a significant part of this increase (25%) to fraud. The IRS concluded that between 27 and 31% of EITC filings were overstated in tax year 1999, which resulted in overpayments of between $8.5 and $9.9 billion and represented about 0.5% of total Federal revenues and 2.8% of the total tax gap (IRS, 2002).2 The high rate of fraud and overstatement motivated Congress to appropriate $145 million of additional funding and personnel for the EITC compliance effort in the 2001 budget. Furthermore, the Welfare Reform Act of 1997 added millions of new low-wage workers to the workforce, potentially increasing the financial impact of the program.
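As a rough consistency check on the IRS (2002) figures quoted above, the following sketch back-calculates the totals implied by the reported shares. The constant and function names are hypothetical, and the implied totals are derived here for illustration only; they are not reported in the source and are only as precise as the rounded shares ("about 0.5%" and "2.8%").

```python
# Back-of-the-envelope check of the EITC figures quoted in the text (IRS, 2002).
# All names are illustrative; the implied totals are derived, not reported.

OVERPAYMENT_RANGE_BN = (8.5, 9.9)  # estimated EITC overpayments, $ billions
SHARE_OF_REVENUE = 0.005           # "about 0.5%" of total Federal revenues
SHARE_OF_TAX_GAP = 0.028           # "2.8%" of the total tax gap


def implied_total(amount_bn: float, share: float) -> float:
    """Return the total (in $ billions) of which amount_bn is the given share."""
    return amount_bn / share


def implied_ranges() -> dict:
    """Apply each quoted share to the low and high overpayment estimates."""
    low, high = OVERPAYMENT_RANGE_BN
    return {
        "federal_revenue_bn": (implied_total(low, SHARE_OF_REVENUE),
                               implied_total(high, SHARE_OF_REVENUE)),
        "tax_gap_bn": (implied_total(low, SHARE_OF_TAX_GAP),
                       implied_total(high, SHARE_OF_TAX_GAP)),
    }


if __name__ == "__main__":
    for name, (low_bn, high_bn) in implied_ranges().items():
        print(f"{name}: {low_bn:,.0f} to {high_bn:,.0f}")
```

Because a single rounded share is applied to both ends of the overpayment range, the resulting intervals should be read as order-of-magnitude checks rather than estimates.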
In light of such concerns and increasing Congressional and IRS interest in low-income individual taxpayers’ compliance behavior, research that provides insights into the factors associated with their compliance intentions could have important policy implications. While prior research has advanced our understanding of compliance behavior in general, this study contributes to the extant literature in three important ways. First, while most studies have examined the effect of overall equity perceptions of the tax system on tax compliance intentions, the present study extends this literature by examining the effects of vertical, horizontal, and exchange dimensions of equity perceptions (Jackson & Milliron, 1986; Moser et al., 1995). Vertical equity refers to the perceived tax burden of lower, as compared to higher, income taxpayers; horizontal equity concerns the perceived tax burden of one taxpayer relative to other taxpayers with comparable economic means; and exchange equity refers to the perceived benefits taxpayers receive relative to the taxes they pay. Second, this study expands the usual conceptualization of normative influences beyond external social influences (subjective norms) to include an internal personal dimension (moral norms), as suggested by Beck and Ajzen (1991). While subjective norms refer to the perceived social influence to engage in a given behavior, moral norms reflect the internalized moral conscience that guides an individual’s behavior (Beck & Ajzen, 1991). Third, this study examines the tax compliance intentions of an under-represented but increasingly important segment of the population – low-income individual taxpayers – within the framework of a comprehensive tax compliance model.

A sample of low-income individual taxpayers was presented with a hypothetical tax scenario involving two sources of income: salary income subject to IRS reporting and self-employment income not subject to third-party reporting.
Each participant was also presented with a table indicating the amount of refundable credit corresponding to a given level of reported income. Participants were then asked to make a compliance decision. Prior to receiving the compliance decision task, participants indicated their perceptions of vertical, horizontal, and exchange equity. After the decision task was completed, perceptions of normative influences and legal sanctions were obtained. Consistent with the hypotheses, the results indicate a significant positive relationship between compliance intentions and equity perceptions of the tax system, normative expectations of compliance, and penalty magnitude. There were also interactive effects between penalty magnitude and exchange equity, and between penalty magnitude and normative expectations. Although we theorized that normative influences would comprise social and moral norms, factor analysis indicated that normative influences also included detection risk – the likelihood that the IRS will detect deliberate underreporting of income.
The remainder of this paper is organized as follows. The next section develops the theoretical framework that links tax compliance intentions to perceptions of equity, normative influences and legal sanctions and presents the study hypotheses. This is followed by a presentation of the research methodology and analysis of the research findings. The last section discusses the study’s limitations and contributions.
BACKGROUND AND THEORY

Despite a long history of taxation in the U.S., how tax evasion behavior is formed is still not fully understood. Only recently have researchers begun to examine such basic issues as the role of normative influences, equity perceptions, legal complexity and rule ambiguity in determining tax compliance (or non-compliance) behavior (Klepper & Nagin, 1989). Prior tax compliance research has used several competing approaches and theoretical frameworks, such as deterrence theory, equity theory (Adams, 1965), the equity-control model (Maroney et al., 1998), fiscal psychology, exchange equity theory (Moser et al., 1995) and procedural justice (Worsham, 1996). Although these models have provided useful insights into our understanding of compliance behavior in general, empirical evidence from this stream of research has not converged (Alm, 1991; Cowell, 1992; Webley et al., 1985). Most prior tax compliance studies share common limitations: omission of the social situations and environments in which taxpayers are embedded (Smith & Kinsey, 1987), simplicity and the omission of taxpayers’ morality (Kaplan et al., 1997), and inappropriate experimental contexts (Christensen & Hite, 1993). The inconclusive results from prior research based on existing theoretical frameworks suggest the need for a more comprehensive theory of behavioral choice.
The Theory of Planned Behavior

The theory of planned behavior provides an alternative theoretical framework that could offer a more complete understanding of compliance behavior by integrating the central provisions of existing theories (Ajzen, 1991). Derived from the theory of reasoned action (Fishbein & Ajzen, 1975), the theory of planned behavior posits that individuals’ intentions toward a behavior are determined by their: (1) attitude toward the behavior; (2) subjective norms with regard to the behavior; and (3) perceived behavioral control over engaging in
the behavior. An individual’s attitude toward a behavior refers to the degree to which the individual has a favorable or unfavorable evaluation of the behavior, and is shaped by the individual’s beliefs about the consequences associated with performing the behavior. Subjective norms refer to individuals’ perceptions of social expectations to perform or not to perform the behavior, and their motivation to comply with those perceived expectations. Generally, the more an individual perceives that important social peers would approve of certain behavior the stronger is the intention to perform that target behavior. Perceived behavioral control refers to the perceived ease or difficulty of performing a given behavior. Intention is defined as an individual’s subjective probability to perform a behavior (Ajzen, 1991) or a composite behavioral inclination toward a behavior (Smith & Kinsey, 1987). As a general rule, a stronger intention to perform a behavior is associated with a greater likelihood of engaging in the behavior. The theory of planned behavior posits that individuals’ intentions, together with their perceived control over the behavior, determine whether they will actually engage in the behavior. Unlike the theory of reasoned action whose range of applicability was restricted to willful behaviors that an individual can decide to perform or not to perform, the theory of planned behavior applies to behaviors that are not under the individual’s complete volitional control (Ajzen, 1991). While the theory of planned behavior has been used extensively in the social science literature to predict or explain several deviant behaviors, few studies have explicitly examined the attitudes and beliefs associated with tax non-compliance behaviors. 
Smith and Kinsey (1987) make a significant contribution to this stream of research by presenting a suggested framework for examining the theory of planned behavior in the context of tax compliance, as they propose a model of specific socio-psychological factors that taxpayers consider in making their compliance decisions. We extend and test their suggested model in the current study.
A Social-Psychological Model of Tax Compliance Behavior

Fig. 1. Research Model of Tax Compliance Behavior.

The research model used in this study (see Fig. 1) is a conceptual model of the relevant social-psychological factors that taxpayers theoretically consider in forming their tax compliance intentions. The model, based on the theory of planned behavior framework, proposes three sets of general constructs that shape taxpayers’ attitudes and beliefs in the tax compliance context: (1) equity perceptions of the tax system; (2) normative expectations of “important others” and moral conscience; and (3) the perceived legal sanctions associated with the particular
non-compliant behavior. The model further posits that tax compliance intentions, together with perceived legal sanctions, ultimately determine whether the taxpayer actually engages in a deliberate non-compliant behavior. Central to the model is the taxpayer’s compliance intention. In the context of this research, intention is defined as a taxpayer’s subjective probability of engaging in a target behavior. As a general rule, the stronger the compliance intention, the greater the likelihood that the taxpayer will actually be compliant in their tax reporting decision. Each of the factors in the model is also conceptualized as consisting of several variables that involve analytically similar dimensions. The following sections describe the extant research on the elements of the model (perceived equity, subjective norms, and legal sanctions) and develop the research hypotheses.

Equity Perceptions

In establishing the linkage between equity perceptions and attitude formation, Ajzen (1985, p. 85) proposed that perception of equity is “capable of influencing people’s attitudes toward their positions in a relationship, toward their partners in the relationship, toward the relationship as a whole, toward the tasks they are to perform, and toward the person or agent responsible for the inequity.” Prior research into the role of equity perceptions in tax compliance intentions and behavior has been based primarily on equity theory (Adams, 1965), which focuses on fairness judgments of outcomes and the behavioral effect of such judgments. The theory suggests that when individuals perceive their outcomes to be inequitable, they will try to restore equity by altering their input-to-output ratio, either by reducing their input or increasing their output. Despite the intuitive appeal of this reasoning, empirical evidence linking equity perceptions of the overall tax system to compliance decisions has been mixed.
While McGraw and Scholz (1991) did not find a significant relationship between the general perceptions of equity and compliance decisions, studies by Hite and Roberts (1992) and Roberts (1994) suggest that equity perceptions do influence tax compliance attitudes. For example, McGraw and Scholz (1991) examined the relationship between perceived fairness and a compliance decision in the context of a communication that either emphasized social or personal consequences of tax reform. No significant difference in the compliance decisions between the two groups was found. In contrast, Roberts (1994) reported that public service announcements significantly improved attitudes toward compliance. Results from a different stream of research (e.g. Hite & Roberts, 1992; Maroney et al., 1998) examining the fairness perceptions of specific tax provisions, have consistently found a positive relationship between perceived fairness and taxpayer compliance. Overall, tax research dealing with fairness perceptions suggests that the nature of
equity perceptions, as well as the psychological processes that serve to form such perceptions, are still not well understood. The current study further expands the compliance literature by considering equity perception along several dimensions that taxpayers potentially use to weigh the equity of their tax burden, as indicated by Jackson and Milliron (1986). One dimension, exchange equity, involves the perceived equity of the exchange relationship between the taxpayer and the government, or the perceived benefits that the taxpayer receives for the tax dollars given. Perceived exchange inequity occurs when the taxpayers’ inputs (taxes) are perceived to be greater than their outputs (benefits). The second dimension, horizontal equity, refers to the taxpayers’ perceived equity of their tax burden as compared to other taxpayers with equivalent economic means. Horizontal inequity arises when taxpayers perceive that their share of the tax burden is disproportionately larger than other taxpayers in similar economic circumstances. The third dimension, vertical equity, refers to the taxpayers’ equity perception of their tax burden in relation to other taxpayers with more income. Vertical inequity arises when lower-income taxpayers perceive that their share of the tax burden is greater than higher-income taxpayers. Given the relatively few studies that have examined the effect of the different dimensions of perceived equity on tax compliance intentions or behavior, little is known about the specific equity dimensions on which taxpayers base their intentions in tax compliance contexts. Although Moser et al. (1995) provide preliminary evidence suggesting that horizontal and exchange equity are important factors in compliance decisions, their experimental study was based on economic conception of equity rather than taxpayers’ perceptions of equity. This study examines what types of equity perceptions, if any, influence low-income individual taxpayers’ compliance intentions. 
Since there is no empirical evidence indicating what dimensions of equity low-income taxpayers use in making their compliance decisions, we rely on three different perceptions of equity as a basis for our hypothesis. Equity theory would assert that higher perceptions of equity should lead to stronger intentions to engage in the behavior of interest. Thus, the following multi-part hypothesis is offered (all hypotheses are stated in the alternative form):

H1. There will be a positive relationship between tax compliance intentions and perceptions of (H1a) vertical equity, (H1b) horizontal equity, and (H1c) exchange equity.

Normative Influences

The second component of the research model is the taxpayers’ normative expectations toward compliance. Many studies that have examined the effects of
normative influences on behavioral choices have conceptualized such influences in terms of social comparison. In the social psychology literature, this social comparison is referred to as subjective norms. Thus, subjective norms refer to perceived social pressure from referent others to engage (not engage) in a specific behavior (Beck & Ajzen, 1991). It indicates an individual’s perception of the expectations of people who are important to the individual (spouse, family, social peers, co-workers, etc.) with respect to the target behavior (Randall & Gibson, 1991). As a general rule, the more an individual perceives that important social peers would approve of a behavior, the stronger is the individual’s intention to engage in the behavior. Many studies that have investigated this relationship between behavioral choices and subjective norms have generally found that they explain a significant proportion of individual’s intention towards risky driving behavior (Parker et al., 1992), shoplifting and lying (Beck & Ajzen, 1991), and ethical intentions (Randall & Gibson, 1991); although the findings have not been unanimous (Beck & Ajzen, 1991). Applying this reasoning to the tax context would suggest that as the perceived subjective norm toward compliance increases, compliance intentions and related behaviors would also increase. Conversely, when taxpayers perceive that their referent others (peers, spouses, family) would encourage or at least approve of deviant behavior, the taxpayers’ intentions toward non-compliance would also be expected to increase. In spite of the intuitive appeal of this proposition, very few studies have examined this relationship in a tax compliance context. For those studies that have examined this relationship, social norms were not the focus of the research and social norms were conceptualized very narrowly, i.e. peer influence (Maroney et al., 1998). The current study incorporates several facets of social norms (i.e. 
family, significant others and friends) and proposes the next hypothesis.

H2a. There will be a positive relationship between tax compliance intentions and social norms.

Regarding the role of normative influences in tax compliance intentions, other researchers have focused on the external social dimension to the exclusion of the equally important internal dimension of moral norm. Ajzen (1991) suggests that the inclusion of personal feelings of moral obligation or responsibility toward compliance would significantly increase the explanatory power of most models of behavioral choice in socially sensitive circumstances. Although prior research has separately considered the roles of subjective norms and morality on compliance to varying degrees, no study has examined both factors in the same framework, therefore precluding the examination of the relationship between them. The current study expands the literature by considering the independent and collaborative roles of moral norms.
Although research on the role of morality in tax compliance intentions is sparse, the results suggest that moral norms are important in understanding taxpayers’ compliance intentions and behavior. For example, Kaplan et al. (1997) found that individuals’ moral reasoning capacity moderated the effect of the IRS’s compliance strategies on their compliance behavior. They found that legal sanctions were effective in reducing non-compliance intentions for taxpayers who have a low sense of moral responsibility, while “appeals to moral conscience” communications were more effective for taxpayers with a high sense of moral responsibility (Schwartz & Orleans, 1967), although the results have not been unanimous (McGraw & Scholz, 1991). This study examines a similar proposition, that morality affects low-income individual taxpayers’ compliance intentions, with the following hypothesis:

H2b. There will be a positive relationship between tax compliance intentions and moral norms.

Legal Sanctions

Prior tax compliance research has generally examined the effect of legal sanctions on tax compliance along two dimensions – perceptions of detection risk and the penalty magnitude associated with the behavior. Detection risk refers to the likelihood that the IRS will detect the tax non-compliant behavior. Penalty magnitude, on the other hand, refers to the perceived magnitude or severity of the penalty, in terms of fines and jail terms, associated with the detection of the tax non-compliant behavior. Studies examining the effect of detection risk on non-compliance intentions and behaviors have generally found mixed results (Roth & Witte, 1985; Webley et al., 1985). In further exploring this issue, Fischer et al. (1992) suggest that the mixed results from prior research are due to problems with the way that detection risk is conceptualized, and recommend that future research focus on perceptions of detection risk rather than objective measures of audit probabilities.
In contrast, research examining the effect of sanctions magnitude on noncompliant behavior has generally been more positive (Carnes & Englebrecht, 1995; Witte & Woodbury, 1985), although the results have not been unanimous (e.g. Klepper & Nagin, 1989). This latter group of researchers argue that an increase in sanctions magnitude may reduce tax non-compliance only under specific conditions, for example, if the personal cost resulting from the increase is significant. The current study contributes to the literature by examining both detection risk and penalty magnitude, which leads to the final hypothesis. H3 . There will be a positive relationship between tax compliance intentions, and, perceived detection risk (H3a ) and perceived penalty magnitude (H3b ).
RESEARCH METHOD
Participants
Prior behavioral tax research studies have been criticized for not paying sufficient attention to the appropriateness of the participants used in their research (O’Neil & Samelson, 2001). To obtain suitable participants for a study of the tax compliance intentions of low-income individual taxpayers, residents of a large government housing project in a metropolitan area in the Southeast were recruited for this study. Participants were asked to review a hypothetical tax scenario that involved two components of income (salary and self-employment income). As an incentive to participate in the study, each participant received a coupon from a local fast-food restaurant.3 Of the 197 questionnaires collected, there were 146 usable responses.4 Sample demographics, presented in Table 1, indicate that most participants (93.8%) had received the EITC in one or more of the past five years. Additionally, the majority of the respondents (96.6%) failed to report all income in at least one of the past five years.
Table 1. Demographics. Sample Size = 146
                                                      Percentage of Total
Earned income
  Less than $5,000                                           11.0
  $5,001–$10,000                                             11.0
  $10,001–$15,000                                            13.0
  $15,001–$20,000                                            14.4
  $20,001–$25,000                                            15.8
  $25,001–$30,000                                            20.5
  $30,001–$40,000                                            14.3
Age
  15–20                                                      11.7
  21–30                                                      44.5
  31–40                                                      30.1
  41–50                                                      11.6
  51–60                                                       2.1
Gender
  Male                                                       40.4
  Female                                                     59.6
Education
  Less than high school                                      19.9
  High school                                                41.1
  Two years of college                                       36.3
  Four years of college                                       0.7
  Graduate                                                    2.0
Job description
  Unskilled                                                  38.4
  Semi-skilled                                                7.5
  Skilled                                                    34.9
  Professional                                               19.2
Marital status
  Married                                                    32.2
  Single/head of household                                   67.8
Number of years received EIC
  0                                                           6.2
  1                                                           7.5
  2                                                          13.7
  3                                                          14.4
  4                                                          11.0
  5                                                          47.2
Number of years failed to report all income in the past five years
  0                                                           3.4
  1                                                           2.1
  2                                                           4.8
  3                                                           8.2
  4                                                          25.3
  5                                                          56.2
Race/ethnicity
  White                                                      49.9
  Black                                                      33.6
  Hispanic                                                   11.0
  American-Indian                                             0.7
  Asian-Pacific Islander                                      4.8
Task and Procedures
Participants responded to a three-part questionnaire. First, respondents’ perceptions of the equity of the U.S. Federal tax system were assessed. This was done before the decision task, so that the task itself did not bias their equity perceptions. Next, participants were presented with a hypothetical tax scenario involving the reporting of salary and self-employment income. Participants were told that the employer sends salary income information to the IRS at the end of the year, but that
reporting self-employment income is the responsibility of the taxpayer. In order to remind participants of the incentives for under-reporting their self-employment income, they were presented with a table that showed the amount of refundable EITC they would receive based on the amount of self-employment income reported. After reading the scenario and EITC table, participants indicated how much of the $12,000 self-employment income they would report. The final section of the questionnaire was used to collect demographic data. Variable measures were randomized and often reverse-coded in order to increase the internal validity of the measurement scales.
The compliance decision environment was structured in accordance with the phase-out range of the EITC program, where additional reported income reduces the amount of refundable credit until a complete phase-out level is reached.5 In the phase-out range, low-income individual taxpayers have an incentive to understate their income in order to increase the EITC refund and reduce their self-employment taxes. This scenario was selected because preliminary interviews with low-income taxpayers suggested that many of them supplement their income by engaging in self-employment activities (street vending, music and performing arts, cosmetology, etc.) where they are typically paid on a cash basis with no reporting by the payer to the IRS.
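The incentive created by the phase-out range can be illustrated with a short sketch. Only the 20% phase-out rate is taken from the study (note 5); the maximum credit and the income level at which the phase-out begins are hypothetical placeholders, not values from the EITC table shown to participants. The sketch also ignores the self-employment tax savings mentioned above.

```python
def eitc_credit(earned_income: float,
                max_credit: float = 4000.0,       # hypothetical maximum credit
                phaseout_start: float = 14000.0,  # hypothetical start of the phase-out range
                phaseout_rate: float = 0.20) -> float:  # 20% rate used in the study (note 5)
    """Refundable credit when income is in or beyond the phase-out range.

    Each additional dollar reported above phaseout_start reduces the
    credit by phaseout_rate dollars until the credit reaches zero.
    """
    excess = max(0.0, earned_income - phaseout_start)
    return max(0.0, max_credit - phaseout_rate * excess)


def credit_gain_from_underreporting(salary: float,
                                    se_income: float,
                                    se_reported: float) -> float:
    """Extra credit obtained by reporting only se_reported of se_income.

    Salary is always counted in full because, in the scenario, the
    employer reports it to the IRS.
    """
    credit_if_fully_reported = eitc_credit(salary + se_income)
    credit_if_underreported = eitc_credit(salary + se_reported)
    return credit_if_underreported - credit_if_fully_reported
```

Under these placeholder parameters, the gain from omitting self-employment income grows with the amount omitted until reported income falls to the start of the phase-out range, which is the incentive the scenario was designed to make salient.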
Measured Variables
This section discusses the research metrics. Presented in Table 2 are the item wordings, reliability estimates, means and standard deviations for each variable.
Tax Equity
Participants indicated their perceptions of vertical equity (three items), horizontal equity (three items) and exchange equity (three items). The vertical and horizontal items were adapted from Jackson and Milliron (1986) and the exchange equity items were adapted from Yankelovich et al. (1984). The scales were oriented such that 1 equals very unfair and 7 equals very fair.
Social Norms
Social norm perceptions were measured using three items reflecting the perceived expectations of “important others.”6 A high score suggests that one would feel social pressure to properly report the additional income.
Moral Norms
Moral norms were assessed using three items that were adapted from Ajzen (1991). A high score suggests that one would feel guilty if income were under-reported.
Table 2. Descriptive Statistics and Reliability Estimates.(a)
Each entry gives the item number, mean and standard deviation, followed by the item wording. Alpha(b) is reported once per subconstruct.

Vertical equity (Alpha = 0.73)
VE1 (Mean = 4.77, Std. Dev. = 1.75): How fair or unfair is the amount of federal income taxes that you pay when compared to people who make more money than you? (1 = Very Unfair, 7 = Very Fair)
VE2 (Mean = 4.58, Std. Dev. = 1.82): People like me pay a larger share of our incomes in federal taxes than do rich taxpayers. (1 = Strongly Agree, 7 = Strongly Disagree)
VE3 (Mean = 4.37, Std. Dev. = 1.99): Rich taxpayers pay a larger share of their incomes in federal taxes than do taxpayers like me. (1 = Strongly Agree, 7 = Strongly Disagree)

Horizontal equity (Alpha = 0.75)
HE1 (Mean = 3.70, Std. Dev. = 1.56): I pay about the same amount of federal income taxes as other people who make about the same income as I do. (1 = Strongly Disagree, 7 = Strongly Agree)
HE2 (Mean = 4.66, Std. Dev. = 1.45): Most people who earn about the same income as I do pay more taxes than I do. (1 = Strongly Agree, 7 = Strongly Disagree)
HE3 (Mean = 3.64, Std. Dev. = 1.35): I pay more taxes compared to most people who make about the same income as I do. (1 = Strongly Agree, 7 = Strongly Disagree)

Exchange equity (Alpha = 0.79)
EE1 (Mean = 4.75, Std. Dev. = 1.82): How fair or unfair is the amount of federal income taxes that you pay when compared to the amount of services you get back from the federal government? (1 = Very Unfair, 7 = Very Fair)
EE2 (Mean = 4.73, Std. Dev. = 2.01): I pay more in federal income taxes than I receive in services from the federal government. (1 = Strongly Agree, 7 = Strongly Disagree)
EE3 (Mean = 4.74, Std. Dev. = 1.93): I am satisfied with the amount of benefits I receive from the federal government compared to the amount of taxes I pay. (1 = Strongly Disagree, 7 = Strongly Agree)

Social norms (Alpha = 0.78)
SN1 (Mean = 2.84, Std. Dev. = 1.79): My family (father/mother/brother/sister) will expect me to report the additional $12,000 income from my part-time business in my tax return. (1 = Strongly Disagree, 7 = Strongly Agree)
SN2 (Mean = 3.48, Std. Dev. = 1.90): My significant other (wife/husband/boyfriend/girlfriend) will expect me to report the additional $12,000 income from my part-time business in my tax return. (1 = Strongly Disagree, 7 = Strongly Agree)
SN3 (Mean = 3.51, Std. Dev. = 1.92): My friends will expect me to report the additional $12,000 income from my part-time business in my tax return. (1 = Strongly Disagree, 7 = Strongly Agree)

Moral norms (Alpha = 0.60)
MN1 (Mean = 3.39, Std. Dev. = 2.23): I will feel guilty if I do not report the additional $12,000 income from my part-time business in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
MN2 (Mean = 3.99, Std. Dev. = 2.11): It is against my personal principles NOT to report the additional $12,000 part-time business income in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
MN3 (Mean = 2.91, Std. Dev. = 1.91): It is wrong NOT to report the additional $12,000 part-time business income in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)

Detection risk (Alpha = 0.85)
DR1 (Mean = 4.26, Std. Dev. = 1.76): How likely would the IRS find out if you don’t report the additional $12,000 part-time business income in your tax return (1 = Very Unlikely, 7 = Very Likely)
DR2 (Mean = 4.02, Std. Dev. = 1.81): In this age of computers, the IRS will find out if I don’t report the additional $12,000 part-time business income in my tax return (1 = Strongly Disagree, 7 = Strongly Agree)
DR3 (Mean = 4.16, Std. Dev. = 1.73): The chance that I will be caught if I don’t report the additional $12,000 additional income from my part-time business is (1 = Very Low, 7 = Very High)

Penalty magnitude (Alpha = 0.86)
PM1 (Mean = 2.54, Std. Dev. = 1.55): I would be in serious trouble if the IRS found out that I did not report the additional $12,000 business income in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
PM2 (Mean = 2.59, Std. Dev. = 1.56): The IRS would severely punish me if they found out that I did not report some or all of the $12,000 additional business income in my tax return in order to receive a larger tax refund. (1 = Strongly Disagree, 7 = Strongly Agree)
PM3 (Mean = 2.52, Std. Dev. = 1.45): How serious would the punishment be if the IRS found out that you did not report some or all of the additional $12,000 income from your part-time business? (1 = Very Mild, 7 = Very Serious)

Tax compliance intentions (Alpha = 0.76(c))
TCI1 (Mean = 4.20, Std. Dev. = 2.38): If you were in Bobby’s situation, how much of the $12,000 additional business income would you report? (1 = $0, 2 = $1–$2,000, 3 = $2,001–$4,000, 4 = $4,001–$6,000, 5 = $6,001–$8,000, 6 = $8,001–$10,000, 7 = $10,001–$12,000)
TCI2 (Mean = 4.08, Std. Dev. = 2.21): If you were Bobby, how likely is it that you would report the additional $12,000 income from the part-time business? (1 = Very Unlikely, 7 = Very Likely)

(a) Each variable was assessed using a 7-point Likert scale.
(b) Reliability estimates reflect Cronbach’s alpha.
(c) Pearson correlation.
Detection Risk
Three items, developed and pilot-tested specifically for this study, were used to measure the likelihood of getting caught under-reporting income. A high score suggests a high likelihood of getting caught or being detected.
Penalty Magnitude
Penalty magnitude was assessed using three items developed and pilot-tested specifically for this study. A high score suggests a strong or tough penalty for under-reporting income.
Dependent Variable: Tax Reporting Intentions
The decision task dealt with how much of the $12,000 in self-employment income participants would report. The pilot study revealed that participants might not respond truthfully if they felt that the researcher would inform the IRS of their non-compliant intentions. Despite numerous attempts to convince the pilot participants that their responses could not be specifically identified to them and that the researchers would maintain complete confidentiality, the participants nevertheless indicated their reluctance to respond truthfully. Hence, tax compliance intention was indirectly measured by asking participants to assume the role of a fictitious person, Bobby Smith, and to indicate the amount of self-employment income that Bobby would report. In addition, participants were asked the likelihood they would report the additional income if they were Bobby. Thus, the tax compliance items approximate asking the participants how much they would report without triggering any fear of retribution from first-person truthful reporting.
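The reliability estimates reported in Table 2 are Cronbach's alpha values, and several items are reverse-coded before scoring. The following is a minimal sketch of both computations, assuming responses are stored as a respondents-by-items array for one scale; the function names are illustrative and do not come from the study.

```python
import numpy as np


def reverse_code(scores, low=1, high=7):
    """Reverse a Likert item so that 1 becomes 7, 2 becomes 6, and so on."""
    return (low + high) - np.asarray(scores)


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: 2-D array with one row per respondent and one column per item,
    with all items already coded in the same direction.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                              # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)      # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)
```

Alpha approaches 1 when items move together across respondents and falls toward 0 when they are unrelated, which is why a threshold such as 0.60 is used as a minimum standard for treating the items as a single scale.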
RESEARCH RESULTS
Factor Analysis
The inter-item reliability estimates shown in Table 2 were at or above the recommended level of 0.60 (Carmines & Zeller, 1979), indicating acceptable convergent validity. To test for divergent validity, the response items were factor analyzed. Results of the principal components analysis (Varimax rotation) are presented in Table 3. Only factors with eigenvalues greater than or equal to one were retained. The factor analysis accounted for 67.68% of the overall variance among response items (Table 3). Factor one included the social norm items, detection risk items, and one moral norm item. One way to interpret this construct is that the social norms and
Table 3. Results of Factor Analysis on Measured Variables.
Principal components analysis (varimax-rotated component matrix)

Subconstruct         Item #   Factor 1   Factor 2   Factor 3   Factor 4   Factor 5   Factor 6
Vertical equity      VE1       −0.04       0.59       0.04       0.49       0.03      −0.02
                     VE2       −0.03       0.23       0.09       0.82      −0.04      −0.04
                     VE3        0.13       0.15      −0.03       0.85      −0.05       0.09
Horizontal equity    HE1        0.10       0.25       0.04      −0.05       0.76       0.22
                     HE2       −0.06       0.31       0.02       0.08       0.76       0.26
                     HE3       −0.18       0.54       0.01       0.03       0.35       0.40
Exchange equity      EE1        0.05       0.86       0.02       0.09      −0.10      −0.05
                     EE2        0.18       0.65       0.11       0.05      −0.01      −0.13
                     EE3       −0.01       0.83      −0.02       0.20      −0.01       0.06
Social norms         SN1        0.67      −0.02       0.17       0.02       0.26      −0.03
                     SN2        0.77       0.02       0.13      −0.05       0.06       0.12
                     SN3        0.74       0.11       0.10      −0.05       0.12      −0.16
Moral norms          MN1        0.74       0.11       0.22       0.24       0.15       0.18
                     MN2        0.28      −0.06       0.12      −0.02      −0.06       0.77
                     MN3        0.24      −0.18       0.15       0.24       0.37       0.44
Detection risk       DR1       −0.75      −0.04      −0.15       0.05      −0.02      −0.09
                     DR2       −0.80      −0.02      −0.06      −0.04       0.12      −0.14
                     DR3       −0.81       0.09      −0.15      −0.16       0.16      −0.20
Penalty magnitude    PM1        0.30      −0.04       0.84       0.07       0.07       0.09
                     PM2        0.29       0.10       0.84      −0.01       0.06       0.10
                     PM3        0.12       0.07       0.84       0.02      −0.05      −0.01

Eigenvalue                      5.45       3.23       1.67       1.56       1.32       1.06
Percent of variance            25.95      15.38       7.80       7.41       6.30       4.84
Cumulative variance            25.95      41.33      49.13      56.54      62.84      67.68

Note: Factor Interpretations: Factor 1: Normative Expectations; Factor 2: Exchange Equity; Factor 3: Penalty Magnitude; Factor 4: Vertical Equity; Factor 5: Horizontal Equity; Factor 6: Moral Norm Toward Peers.
detection risk sub-constructs reflect normative expectations from two perspectives – one from important others and the other from the Federal government. The moral norm item represents a personal normative expectation. Accordingly, the first factor “normative expectations” was labeled as originally conceived (see Fig. 2). The next four factors mostly reflect exchange equity (factor 2), penalty magnitude (factor 3), vertical equity (factor 4) and horizontal equity (factor 5). The sixth factor is a combination of horizontal equity and moral norms. Since horizontal expectations deal with perceived equity of the tax system among peers, the last construct is labeled “moral norm toward peers.”
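The extraction procedure described in this section (principal components of the item correlations, retention of components with eigenvalues of at least one, and varimax rotation) can be sketched as follows. This is an illustrative implementation run on simulated data, not the software or data used in the study; `simulate_items` and its parameters are assumptions introduced only to make the example runnable.

```python
import numpy as np


def principal_components(data, min_eigenvalue=1.0):
    """Principal components of the item correlation matrix.

    Returns all eigenvalues (largest first) and the unrotated loadings of
    the components whose eigenvalue is at least min_eigenvalue.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= min_eigenvalue
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix (items x factors)."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                      # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


def simulate_items(n_respondents=500, seed=0):
    """Hypothetical data: six items, three driven by each of two latent factors."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_respondents, 2))
    noise = rng.standard_normal((n_respondents, 6))
    pattern = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
    return factors @ pattern.T + 0.6 * noise
```

Because the eigenvalues of a correlation matrix sum to the number of items, each eigenvalue divided by the item count gives the proportion of variance attributable to that unrotated component. The rotation is orthogonal, so it redistributes variance across the retained factors without changing each item's communality.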
Fig. 2. Research Model Results.
Multiple Regression Results (H1, H2, H3)
Factor scores were used in the multiple regression analysis rather than construct indices composed of the sum or average of item responses within each construct. Because the varimax-rotated components are orthogonal, the resulting factor scores are uncorrelated with one another, which eliminates multicollinearity among the independent variables. The hypotheses predicted positive relationships between tax compliance intentions and perceived equity of the tax system (H1a, H1b, and H1c), normative expectations (H2a and H2b) and legal sanctions (H3a and H3b). To test these hypotheses, tax compliance intention was regressed on the six factors arising from the principal components analysis. Additionally, all possible interactions were tested. The significant results are presented in Table 4 and illustrated in Fig. 2. As indicated in Table 4, all six constructs arising from the factor analysis are significantly positively related to tax compliance intentions (p ≤ 0.10). However, exchange equity and normative expectations cannot be interpreted in a straightforward manner, as they interact with penalty magnitude (see Fig. 2).
Table 4. Regression Results of Factor Scores on Tax Compliance Intentions (Dependent Variable = Tax Compliance Intention(a)).

Model                                          Coefficient   Std. Error   Beta Coefficient   t-Value   Significance (p-Value)
Intercept                                          6.17         0.71                           8.67          0.01
Main factors
  Normative expectations                           1.28         0.12          0.621           10.63          0.01
  Exchange equity                                  0.39         0.12          0.187            3.33          0.01
  Penalty magnitude                                0.36         0.12          0.175            3.10          0.01
  Vertical equity                                  0.28         0.12          0.137            2.43          0.02
  Horizontal equity                                0.25         0.12          0.120            2.12          0.04
  Moral norm toward peers                          0.21         0.12          0.103            1.78          0.08
Significant interactions(b)
  Exchange equity by penalty magnitude             0.29         0.11          0.150            2.55          0.01
  Normative expectations by penalty magnitude      0.21         0.12          0.102            1.70          0.09
Significant covariates
  Income bracket                                  −0.13         0.06         −0.122           −2.11          0.04
  Previous reporting failures                     −0.20         0.10         −0.123           −1.98          0.05

Note: R² = 0.59, Adjusted R² = 0.56, Overall F-ratio = 19.46 (p < 0.01).
(a) The Tax Compliance Intention items (TCI1 and TCI2) were averaged to form a single compliance index.
(b) All other two-way and three-way interactions were non-significant at p < 0.10.
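A regression of this form, with main effects and a product (interaction) term, can be sketched with ordinary least squares. The sketch below is illustrative only: it assumes the interaction was modeled as the product of two factor scores, uses just two predictors for brevity, and the function names are not from the study.

```python
import numpy as np


def design_matrix(exchange_equity, penalty_magnitude):
    """Columns: intercept, two main effects, and their product term."""
    ee = np.asarray(exchange_equity, dtype=float)
    pm = np.asarray(penalty_magnitude, dtype=float)
    return np.column_stack([np.ones_like(ee), ee, pm, ee * pm])


def fit_ols(X, y):
    """Ordinary least squares: coefficients, standard errors and t-values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs, n_params = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    sigma2 = residuals @ residuals / (n_obs - n_params)  # residual variance
    covariance = sigma2 * np.linalg.inv(X.T @ X)
    std_errors = np.sqrt(np.diag(covariance))
    return beta, std_errors, beta / std_errors
```

A t-value in a table such as Table 4 is the coefficient divided by its standard error, which is what the last returned array contains.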
The positive relationships between tax compliance intentions and vertical equity (H1a) and horizontal equity (H1b) partially support H1. The interaction between exchange equity and penalty magnitude also partially supports H1. That is, holding penalty magnitude constant, greater perceptions of exchange equity are associated with higher levels of tax compliance intentions (H1c). The combined results indicate that H1 is supported.
Evaluation of the second hypothesis (H2) is conditioned on the following interpretation. Holding penalty magnitude constant, social norms are positively related to tax compliance intentions (H2a). However, only one of the moral norm items loaded on the “normative expectations” construct; hence, the expected positive relationship between moral norms and compliance intentions is weakly supported (H2b), with the caveat that it too must be interpreted in light of the interaction. Additionally, the construct entitled “moral norm toward peers” also appears to be a subcomponent of normative expectations, as it deals with a consciousness to comply with the Federal tax code in a similar manner as peers. The positive relationship between “moral norm toward peers” and tax compliance intentions further supports H2b, again, when interpreted in proper perspective of the interaction. Unexpectedly, the three detection risk items loaded on the normative expectations factor. Detection risk can be viewed as a form of normative expectations or oversight from an external party (the Federal government), even though this relationship was not hypothesized. Overall, research findings suggest that H2 is partially supported.
The third hypothesis is only partially supported as well, since the detection risk items (H3a) loaded on the normative expectations factor, and the penalty magnitude items (H3b) loaded on their own factor. Yet, penalty magnitude must be interpreted in light of exchange equity and normative expectations; that is, holding exchange equity constant, more severe penalties are associated with higher tax compliance intentions, and, holding normative expectations constant, greater penalties are also associated with higher compliance intentions.
All demographic variables were originally included in the regression model as possible covariates, but only two were significant (p ≤ 0.10): income bracket and previous reporting failures. Interestingly, higher income brackets were associated with weaker intentions to comply. Not surprisingly, greater numbers of previous income reporting failures were associated with lower tax compliance intentions.
Overall, the sub-construct items did not load on the constructs precisely as expected (compare Fig. 1 to Fig. 2). Nevertheless, factor analysis and regression model results indicate that the constructs articulated in the theory of planned behavior (attitudes, subjective norms and perceived behavioral control) are predictive of low-income individual taxpayers’ compliance intentions.
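To make the interaction interpretation concrete, the conditional slopes implied by the Table 4 coefficients can be computed directly. The sketch assumes that each interaction enters the model as the product of two factor scores and that factor scores are standardized (mean 0, standard deviation 1); neither assumption is stated explicitly in the study, and the function names are illustrative.

```python
# Unstandardized coefficients as reported in Table 4.
B_NORMATIVE_EXPECTATIONS = 1.28
B_EXCHANGE_EQUITY = 0.39
B_PENALTY_MAGNITUDE = 0.36
B_EXCHANGE_BY_PENALTY = 0.29
B_NORMATIVE_BY_PENALTY = 0.21


def exchange_equity_slope(penalty_score: float) -> float:
    """Change in the compliance index per unit of exchange equity,
    at a given penalty magnitude factor score."""
    return B_EXCHANGE_EQUITY + B_EXCHANGE_BY_PENALTY * penalty_score


def normative_expectations_slope(penalty_score: float) -> float:
    """Change in the compliance index per unit of normative expectations,
    at a given penalty magnitude factor score."""
    return B_NORMATIVE_EXPECTATIONS + B_NORMATIVE_BY_PENALTY * penalty_score


def penalty_magnitude_slope(exchange_score: float, normative_score: float) -> float:
    """Change in the compliance index per unit of penalty magnitude,
    at given exchange equity and normative expectations factor scores."""
    return (B_PENALTY_MAGNITUDE
            + B_EXCHANGE_BY_PENALTY * exchange_score
            + B_NORMATIVE_BY_PENALTY * normative_score)
```

Under these assumptions, the association between exchange equity and compliance intentions strengthens as perceived penalties rise above the sample average and weakens as they fall below it, which is why the main effects alone cannot be read in isolation.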
DISCUSSION
This study examines the extent to which low-income individual taxpayers’ compliance intentions are influenced by perceptions of tax equity (vertical, horizontal, and exchange equity), normative expectations (social and moral norms) and legal sanctions (detection risk and penalty magnitude). The research model (Fig. 1) is based on the theory of planned behavior (Ajzen, 1985, 1991). While prior tax compliance research has largely ignored low-income individual taxpayers, continued significant growth of the EITC program has created a situation where non-compliance among low-income individual taxpayers is becoming a serious problem with growing fiscal implications (IRS, 2002). Hence, any insight into ways to increase tax compliance among these taxpayers can be helpful in setting tax policy.
The research findings suggest at least two policy implications for the Federal government to consider. First, the results indicate significant positive relationships between tax compliance intentions and perceptions of vertical and horizontal equity. This implies that any attempts made by the Federal government to ensure that lower income individual taxpayers feel as though they are not paying proportionately more taxes than upper income individual taxpayers (vertical equity) or peer taxpayers (horizontal equity) can positively impact tax compliance intentions.
Second, legal sanctions (penalty magnitude) reveal an interactive effect with exchange equity and normative expectations. With respect to exchange equity, low-income individual taxpayers must feel as though the benefits they receive from the Federal government are proportionate to the taxes they pay. Positive perceptions in this regard, coupled with strong penalties for non-compliance, should help to improve taxpayer compliance. Regarding normative expectations, there are two aspects to consider: important others and governmental oversight. The expectations of important others clearly matter, but social expectations of this nature are socio-cultural and outside the reach of legislation. However, the Federal government can take steps to increase monitoring via tax return audits. Naturally, attempts to increase the percentage of audits for low-income individual taxpayers must be cost effective, and aligned with the audit rates applied to taxpayers at other income levels (vertical) and at similar income levels (horizontal). As with exchange equity, increased governmental oversight coupled with stronger penalties should lead to greater taxpayer compliance.
This study is limited by certain validity threats common to survey research. First, although efforts were made to maximize the realism of the task, subjects responded to a contrived scenario involving a tax compliance opportunity. To the extent that participants respond differently when making such decisions on their
own returns, results from the present study may have limited external validity. Second, due to the sensitive nature of tax compliance issues, the possibility exists that respondents were not honest in reporting their intentions. To mitigate this possibility, the questionnaire solicited compliance intentions indirectly by asking participants to report their intentions as if they were a hypothetical taxpayer, and participants were assured that their responses would remain anonymous. Third, the current study measures taxpayers’ compliance intentions but not the actual compliance behavior. However, comparing participant demographics to their reported intentions offers some degree of external validity to the link between tax non-compliance intentions and behaviors in the current study, as nearly 97% of respondents reported that they had understated their taxable income at least once during the past five years; thus, their responses to the experimental script are likely indicative of how they would actually behave in a similar circumstance. Future research should continue to address the link between compliance intentions and behavior. Fourth, the maximum amount of tax refund that could be obtained from under-reporting income was assumed to be $2,200 for all participants in this study. Additional research is needed to examine how taxpayers’ decisions may be influenced by larger tax items, for Christensen and Hite (1993) suggest that taxpayers are more conservative with items involving larger liabilities than with items involving smaller liabilities. This indicates that the tax compliance intentions captured in this study may be influenced by the maximum amount of refundable credit that was made available to the taxpayer. Finally, this study dealt with an under-reporting of income scenario in order to receive a higher EITC. 
It is also possible for individual taxpayers to over- or under-report deductions in order to maximize their qualifying EITC income.7 Presumably, the measured variables used in this study would also apply to over-reporting of deductions scenarios. However, taxpayers may perceive that there is less detection risk when deductions are intentionally over-reported. This issue should be addressed in future research on the tax compliance behavior of low-income individual taxpayers.
NOTES
1. “The Earned Income Tax Credit (EITC), sometimes called the Earned Income Credit (EIC), is a refundable Federal income tax credit for low-income working individuals and families. Congress originally approved the tax credit legislation in 1975 in part to offset the burden of social security taxes and to provide an incentive to work. The credit reduces the amount of Federal tax owed and can result in a refund check. When the EITC exceeds the amount of taxes owed, it results in a tax refund to those who claim and qualify for the credit. Income and family size determine the amount of the EITC. To qualify for the credit, both the earned income and the adjusted gross income for 2003 must be less than $29,666 for a taxpayer with one qualifying child ($30,666 for married filing jointly), $33,692 for a taxpayer with more than one qualifying child ($34,692 for married filing jointly), and $11,230 for a taxpayer with no qualifying children ($12,230 for married filing jointly). The EITC Eligibility Checklist on the last page of IRS’ Publication 596, Earned Income Credit, may be used to quickly determine eligibility for the credit.” (Source: http://www.irs.gov/individuals/).
2. “Several years ago the Internal Revenue Service developed the concept of the ‘tax gap’ as a way to measure voluntary federal income tax compliance. The gross tax gap is the difference between taxes owed (the ‘true’ tax liability) and taxes paid voluntarily and timely for any given tax year. The net tax gap is the gross tax gap minus taxes collected through various IRS enforcement programs for the same tax year. Both gross and net individual income tax gaps consist of three main components: non-filing, underreporting and underpayment. The non-filing gap is the amount of tax liability owed by taxpayers who do not voluntarily and timely file returns. The underreporting gap is the amount of tax liability not voluntarily reported by taxpayers who do file returns. The underpayment tax gap is the amount of tax liability individuals report on returns but do not pay voluntarily and timely.” (Source: http://www.unclefed.com/Tax-News/1997/).
3. Several incentives were tested, such as state lottery tickets and small cash payments. After several pilot test trials, the food coupon was deemed to be the best received and most appreciated incentive.
4. Of the 197 responses, 51 were deleted because they reported income that would not qualify as low-income taxpayers based on the 2002 EITC guidelines (e.g. workers raising two or more children with household income of $34,000 (married) or $33,000 (single)).
5. The 20% phase-out rate for the study compares to the 15.98% phase-out rate for the EITC program.
6. In the pilot study, subjects identified spouses, family and friends in that order as the strongest influences on their ethical behaviors.
7. The IRS has recently implemented an initiative aimed at reducing cheating behavior related to the over-reporting of deductions in order to receive a higher EITC. Specifically, the IRS now requires taxpayers to include their dependents’ social security numbers on tax returns in an attempt to halt the fraudulent listing of non-existent dependent children.
REFERENCES
Adams, J. S. (1965). Inequity in social exchange. Advances in Experimental Social Psychology, 2, 267–299.
Ajzen, I. (1985). From intentions to action: A theory of planned behavior. In: J. Kuhl & J. Beckmann (Eds), Action Control from Cognition to Behavior. New York: Springer Verlag.
Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211.
Alm, J. (1991). A perspective on the experimental analysis of tax reporting. The Accounting Review, 66(July), 577–593.
Beck, L., & Ajzen, I. (1991). Predicting dishonest actions using the theory of planned behavior. Journal of Research in Personality, 25, 285–301.
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. Beverly Hills: Sage.
Carnes, G. A., & Englebrecht, T. D. (1995). An investigation of the effect of detection risk perceptions, penalty sanctions, and income visibility on tax compliance. The Journal of the American Taxation Association (Spring), 26–41.
Christensen, A. L., & Hite, P. A. (1993). A study of the effect of taxpayer risk perceptions on ambiguous compliance decisions. The Journal of the American Taxation Association (Spring), 1–18.
Cowell, F. A. (1992). Tax evasion and inequity. Journal of Economic Psychology, 13, 521–543.
Fischer, C. M., Wartick, M., & Mark, M. (1992). Detection probability and taxpayer compliance: A literature review. Journal of Accounting Literature, 11, 1–46.
Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention and behavior: An introduction to theory and research. Boston, MA: Addison-Wesley.
General Accounting Office Reports (GAO) (1997). GAO Reports on EIC Usage. TNT 97, 119–179. (GAO/GCD-97-69). Release Date: May 16, 1997 (Doc 97-17840).
Hite, P. A., & Roberts, M. L. (1992). An analysis of the tax reform based on taxpayers perceptions of fairness and self-interest. Advances in Taxation, 4, 115–137.
Internal Revenue Service (IRS) (2002). Compliance Estimates for Earned Income Tax Credit claimed on 1999 Returns, February, IRS Publications.
Jackson, B. R., & Milliron, V. C. (1986). Tax compliance research: Findings, problems, and prospects. Journal of Accounting Literature, 5, 125–161.
Kaplan, S. E., Newberry, K. J., & Reckers, P. M. (1997). The effect of moral reasoning and educational communications on tax evasion intentions. The Journal of the American Taxation Association, 19(Fall), 38–54.
Klepper, S., & Nagin, D. (1989). Tax compliance and perceptions of the risks of detection and criminal prosecution. Law and Society Review, 23(2), 209–240.
Maroney, J. J., Rupert, T. M., & Anderson, B. H. (1998). Taxpayer reaction to perceived inequity: An investigation of indirect effects and the equity control model. The Journal of the American Taxation Association, 20(Spring), 60–77.
McGraw, K. M., & Scholz, J. T. (1991). Appeals to civic virtue vs. attention to self-interest: Effect on tax compliance. Law and Society Review, 25, 471–498.
Moser, D. V., Evans, J. H., III, & Kim, C. K. (1995). The effects of horizontal and exchange inequity on tax reporting decisions. The Accounting Review (October), 619–634.
O’Neil, C. J., & Samelson, D. P. (2001). Behavioral research in taxation: Recent advances and future prospects. Advances in Accounting Behavioral Research, 4, 103–139.
Parker, D., Manstead, A. S. R., Stradling, S. G., & Reason, J. T. (1992). Intention to commit driving violations: An application of the theory of planned behavior. Journal of Applied Psychology, 77(1), 94–101.
Randall, D. M., & Gibson, A. M. (1991). Ethical decision making in the medical profession: An application of the theory of planned behavior. Journal of Business Ethics, 10(2), 111–122.
Roberts, M. L. (1994). An experimental approach to changing taxpayers’ attitudes towards fairness and compliance via television. The Journal of the American Taxation Association (Spring), 67–86.
Roth, J., & Witte, A. (1985). Understanding taxpayer compliance: Major factors and perspectives. Conference on Tax Administration. Internal Revenue Service (January), 57–78.
Schwartz, R., & Orleans, S. (1967). On legal sanctions. University of Chicago Law Review, 34, 282–300.
Smith, K. W., & Kinsey, K. A. (1987). Understanding taxpaying behavior: A conceptual framework with implications for research. Law and Society Review, 21(4), 639–663.
Webley, P., Morris, I., & Amstutz, F. (1985). Tax evasion during a small business simulation. In: H. Brandstatter & E. Kirchler (Eds), Economic Psychology (pp. 233–242). Linz: Trauner.
Witte, A. D., & Woodbury, D. F. (1985). The effect of tax laws and tax administration on tax compliance: The case of the U.S. individual income tax. National Tax Journal, 38, 1–14.
Worsham, R. G. (1996). The effect of tax authority behavior on tax compliance: A procedural justice approach. The Journal of the American Taxation Association, 18(Fall), 19–39.
Yankelovich, S., & White, Inc. (1984). Taxpayer attitudes study: Final report. Internal Revenue Service.
DETERMINANTS OF TAX PROFESSIONALS’ ADVICE AGGRESSIVENESS AND FEES
Donna D. Bobek and Richard C. Hatfield
ABSTRACT
Prior research has identified a number of variables that influence tax professionals’ judgments. However, these variables have usually been examined in isolation. This study has two main findings. First, using a structured questionnaire that allows for the collection of variables related to actual tax planning engagements, this study validates the findings of numerous laboratory studies using factor and regression analysis. Factors representing risks and rewards associated with the client and the IRS, along with task characteristics and client aggressiveness, significantly affect the aggressiveness of tax advice given to clients. Second, tax professionals do not appear to charge a premium for aggressive tax advice. However, regarding the fee charged, a significant gender effect is found even after controlling for time spent on the engagement, experience, firm size and education.
Advances in Accounting Behavioral Research, Volume 7, 27–51
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07002-4
INTRODUCTION
Roberts (1998) articulated a model of tax accountants’ judgment and decision-making (JDM) processes based on prior research. Included in this model are economic environmental factors representing the risks and rewards associated
with the IRS, the client, and the tax accountant’s firm. He also includes two other sets of factors, individual-psychological factors (e.g. experience, advocacy) and task inputs (e.g. ambiguity and authority). These factors are presumed to affect a tax accountant’s cognitive processing, which in turn leads to some output (e.g. a recommendation on a tax planning issue). This model is based primarily on research findings from experimental studies using mostly “Big 5” accountants as research subjects. The present study has two objectives. The first objective is to test the validity of Roberts’ model using a different data collection technique that allows for the collection of a comprehensive set of variables. In addition, the subjects are tax professionals from smaller accounting firms rather than the large firms which were more typical of the studies examined in Roberts’ (1998) review. The advantage of this approach is to generalize prior results to the type of accounting firms that do most of the tax work (Russell, 2002), and to provide external validity to results found primarily in experimental settings. Further, this approach considers variables that have not been adequately addressed by prior research. While the risks and rewards associated with the IRS and the client have received a great deal of attention, the risks and rewards associated with the tax professionals’ firm have received less attention (Roberts, 1998). The second objective of this study is to determine whether tax professionals “price” aggressiveness. Increased aggressiveness may lead to an increased risk of malpractice claims (Bandy, 1996) and taxpayer and/or tax preparer penalties. Although CPAs are generally barred from charging contingency fees (Department of Treasury, 1994), it would be economically rational to assume that some premium may be associated with advice the tax professional deems particularly aggressive. 
CPAs from primarily small firms responded to a structured questionnaire regarding their last tax planning engagement. Factor analysis revealed that statistically developed factors are consistent with the factors articulated in Roberts’ model. Regression analysis using the factor scores as independent variables identified four factors as influential to the aggressiveness of the tax professionals’ advice: client characteristics (e.g. size, importance), task characteristics (e.g. ambiguity, tax dollars at stake), risks and rewards associated with the IRS (e.g. concern for IRS audit, taxpayer penalties) and, especially, client aggressiveness. The only factor that was not significant was a factor representing risks and rewards associated with the tax professional’s firm (e.g. concern for client loss, and concern for professional liability). Regression analysis was also performed with fees as the dependent variable. After controlling for time spent by the tax professional and others in the firm (the biggest influence on fees), firm size and gender significantly affected fees. The larger the firm (measured as number of professionals), the higher were the
fees. Also, male accountants charged more than female accountants. Experience, education and advice aggressiveness were not significantly related to fees, leading to the tentative conclusion that tax professionals do not charge a premium for aggressive advice. The remainder of the paper is organized as follows: in the next section prior research is discussed, and the study’s research objectives are articulated. In the following section, the method of testing the research objectives is described, followed by a discussion of the results obtained. Finally, the last section provides a summary and suggestions for future research.
LITERATURE REVIEW AND RESEARCH OBJECTIVES
Variables from Roberts’ Model of Tax Judgment and Decision Making
There has been a great deal of experimental research investigating the aggressiveness of tax professionals’ advice (e.g. Ayers et al., 1989; Bandy et al., 1994; Carnes et al., 1996; Cloyd & Spilker, 1999; Cuccia, 1994; Cuccia et al., 1995; Duncan et al., 1989; Helleloid, 1989; LaRue & Reckers, 1989; McGill, 1990; Pei et al., 1992; Reckers et al., 1991; Schisler, 1994). Roberts (1998) synthesized these prior results and articulated a model of tax accountants’ judgment and decision-making. This model is depicted in Fig. 1.1 Roberts (1998) identified three sets of inputs that affect the tax accountant’s cognitive processing (e.g. problem identification, information search, alternative evaluation), which in turn leads to some output (e.g. aggressive tax advice). The three sets of inputs he identified were: individual-psychological factors (e.g. experience, advocacy, risk preferences), economic environmental factors (i.e. risks and rewards associated with IRS, client, and/or firm), and task inputs (e.g. ambiguity, complexity, documentation). He also identified a variety of areas that required additional research. For example, he called for additional research exploring the role that the economic environment has on tax accountants’ judgment and decision-making, particularly the effect of liability concerns.
Fig. 1. Economic Psychology-Processing Model of Tax Accountants’ Judgment/Decision Making.
The present study focuses on the effect of these inputs on the aggressiveness of the tax advice provided by tax professionals. Individual-psychological factors of interest in this study that have been found to be related to advice aggressiveness include issue experience, years of experience, firm size, and gender. In general, experience has been correlated with increased aggressiveness (e.g. Cloyd, 1995; LaRue & Reckers, 1989; Roberts & Klersey, 1996), although it is suggested that experience is merely a proxy for other variables such as better knowledge of IRS audit probabilities (Roberts, 1998). Firm
size has sometimes been associated with more aggressive advice, although Roberts points out that there is no theoretical support for this effect (Roberts, 1998). Results regarding the effect of gender on advice aggressiveness are mixed. While McGill (1990) found males to provide more aggressive advice than females, Ashton (2000) reported weakly significant results that females provide more aggressive advice.2 Task factor variables of interest include law ambiguity and issue complexity. Law ambiguity relates to the clarity of the legal precedents, while issue complexity relates to the clarity of the facts and circumstances of the tax issue. The greater the ambiguity in the tax law, the more aggressive the advice is expected to be (Helleloid, 1989; Kaplan et al., 1988; Klepper & Nagin, 1989). The effect of complexity on advice aggressiveness has not been addressed by prior research. However, similar to law ambiguity, there should be more opportunity to be aggressive as the complexity of the tax issue increased. Economic environmental factors that have been considered by prior research include concerns about IRS action, and characteristics of the client. Concern for IRS involvement has had the expected effect of reducing advice aggressiveness (see e.g. Newberry et al., 1993). Client characteristics such as size, importance, strength, and the aggressiveness of the client have been found to increase the aggressiveness of tax professionals’ advice (Roberts, 1998). Firm factors, such as concern for client loss and concern for professional liability, have not been adequately addressed by prior research, and thus are discussed in more detail. Currently, tax engagements give rise to the majority of malpractice claims filed against CPAs (Anderson & Wolfe, 2001; Wladis, 1995; Yancey, 1996). The AICPA reported that 60% of all accountant malpractice claims in the AICPA Professional Liability Insurance program arose from tax engagements (Anderson & Wolfe, 2001). 
This is up from 43% ten years ago. For many tax firms, the direct cost of malpractice protection is their single largest expense after employee compensation, approximating 10% of total expenses (Bandy, 1996). Professional liability as a risk factor is becoming increasingly important in the tax professional’s work environment. Practitioners are aware that an effective way of limiting malpractice liability is to carefully regulate tax return aggressiveness (Bandy, 1996). Therefore, professional liability pressures may lead to general reductions in the aggressiveness of tax positions. A second risk associated with the tax professional’s firm is the possibility that the firm will lose the client. Roberts and Cargile (1994) examine the effect that concern for client loss has on the aggressiveness of both auditors and tax professionals. They find a significant main effect for the risk of client loss. When the risk of client loss is viewed as high, their subjects’ advice was more aggressive. Their client loss variable was dichotomous and was manipulated by telling the participant that the
perceived risk of losing the client if an expenditure is capitalized is either high or low. Further, they find that the effect of client loss is stronger in an audit context than in a tax context. Therefore, we would expect that concern for client loss will be related to advice aggressiveness. To summarize, the first objective of this study is to perform a fairly comprehensive test of Roberts’ model of tax professionals’ judgment and decision-making, using advice aggressiveness as the dependent variable. Data is collected regarding the three inputs identified by Roberts, individual-psychological, task characteristics and the economic environment. However, the “black box” portion (i.e. cognitive processing) of the model illustrated in Fig. 1 is not addressed.
Tax Professionals’ Fees
There have been numerous studies that have investigated the determinants of audit fees (e.g. Behn et al., 1999; Francis & Simon, 1987; O’Keefe et al., 1994; Simunic, 1980). In addition to identifying client attributes that are related to fees (e.g. size, foreign operations), these studies have also addressed whether or not audit quality and client satisfaction are related to the level of audit fees. There are only a few studies that have discussed tax preparer fees. Christensen (1992) identified fees as a determinant of taxpayers’ perceptions of quality service. In an analytical study, Phillips and Sansing (1998) investigated whether the ban on contingent fees serves to increase compliance (their conclusion was that it does not). Frischmann and Frees (1999), in an archival study using tax return data, determined that tax return preparation fees were associated with tax savings and time savings, but not uncertainty reduction. Ashton (2000), in a unique study which examined the results of a Money magazine tax return preparation contest, found that the fee that the participants said they would have charged was not related to the aggressiveness or the accuracy of their services. There were differences, however, between CPAs and non-CPAs, and between males and females (CPAs and males would have charged more). Although not specifically addressing fees, client importance has been characterized as a surrogate for “high future compensation” (e.g. Reckers et al., 1991), and was found to be related to advice aggressiveness. Other than the Ashton (2000) study, we know of no study that has directly investigated whether there is a link between fees and advice aggressiveness. Nevertheless, providing aggressive advice is not without cost to the tax professional. Providing aggressive advice increases the risk of IRS audit, preparer penalties and taxpayer penalties, and may also translate into a higher risk of malpractice liability (Bandy, 1996). 
However, there is a lack of consensus within the extant literature as to whether or not taxpayers are even seeking aggressive
advice from tax professionals. Schisler (1995) found that taxpayers in his study were more aggressive than tax professionals. On the other hand, Hite (1992) reported that taxpayers do not demand aggressive advice. With this review of the literature as a backdrop, the second objective of the present study is to determine whether the fees charged by tax professionals are related to the aggressiveness of their advice. In addition to data about fees, we collected data regarding the amount of time spent on the engagement by both the tax professional and others in the tax professional’s firm, firm size, gender, education and years of experience.
METHOD
Questionnaire
We developed a questionnaire which asked respondents to recall their most recent tax planning engagement. The questionnaire is reproduced in the Appendix. All of the questions relate to one particular client and engagement. This method is similar to one used by Gibbins et al. (2001). They asked auditors to recall their last negotiation process and respond to a “structured questionnaire” about their experience. While experiments obviously provide the greatest degree of internal validity as well as allow for causality conclusions, they are subject to external validity concerns. This structured questionnaire approach allowed us to “observe” a naturally occurring behavior (i.e. we obtain a “sample” of actual tax planning engagements), and to collect a wide range of variables for a variety of different tax engagements. Our first research objective considered whether or not Roberts’ model, developed from the results of experimental research using primarily Big-5 CPAs as subjects, could be validated using a different methodology and a sample of tax professionals from small firms. Since the particular tax professional output we focused on was the aggressiveness of the advice given in a tax planning engagement, the dependent measure was the subject’s response to the question, “how aggressive was the advice that you gave the client on this specific issue” answered on a 7-point Likert scale (Pei et al., 1990; Roberts & Klersey, 1996; Spilker et al., 1999). We also collected measures designed to correspond to Roberts’ individual-psychological factors,3 task factors and environmental factors. The individual tax professional variables collected were issue experience (7-point scale), years of experience, firm size (measured as number of professionals in the office), position in the firm (e.g. partner, manager, senior, staff), gender and education (e.g. Masters Degree, Bachelors Degree, etc.). The remaining variables used to
test Roberts’ model were measured on either a 7-point or 10-point scale. Task variables collected were law ambiguity and issue complexity. Environmental variables fall into three categories: risks and rewards associated with the IRS, the firm and the client. The IRS variables collected were concern for IRS audit, concern for taxpayer penalties and concern for preparer penalties. The firm variables collected were concern for professional liability and concern for client loss. The client variables collected were client size, client importance, client strength, client relationship, client aggressiveness and tax dollars at stake. Finally, in order to test the relationship between advice aggressiveness and the fee charged the client, we asked the subjects to estimate the fee (in dollars) they charged, and the amount of time (in hours) spent by themselves and others in the firm.
Subjects
Data were collected from non-Big 5 accountants.4 While these participants are not completely comparable to subjects from large firms in many prior studies, they are more representative of the population of tax professionals.5 They were contacted through the mail. Five hundred tax professionals in the Central Florida area were identified as potential participants. To encourage participation, participants were allowed to enter a drawing for free admission to a Professional Education Conference. They entered the drawing by sending a separate email message in order to preserve anonymity. Table 1 presents the sample demographics. The respondents were primarily male (72%), CPAs (83%), partners (88%), in small firms (94%) with fewer than 5 (66%) professionals in the office. The response rate (detailed in Table 2) was approximately 20%.

Table 1. Sample Demographics.
Years of Experience
    0–10 years                                        18%
    10–20 years                                       45%
    20–30 years                                       30%
    Over 30 years                                      7%
Education
    No college                                         0%
    AA/AS                                              7%
    BA/BS                                             56%
    MA/MS                                             36%
    PhD                                                1%
Profession
    CPA                                               83%
    Enrolled agent                                     9%
    Other                                              8%
Position in Firm
    Staff                                              0%
    Senior                                             3%
    Manager                                            8%
    Partner/Owner                                     89%
Firm Size
    Big 5/national/regional                            5%
    Not national or regional                          95%
# of Professionals in Office
    Less than 5                                       66%
    5–20                                              19%
    More than 20                                      15%
Gender
    Male                                              72%
    Female                                            28%
% Who Experienced Previous Malpractice Claim          13%

Table 2. Detail of Response Rate.
Initial questionnaires mailed                         509
Less
    Returned as undeliverable                         (45)
    Deceased                                           (2)
    Not eligible (a)                                   (4)
Adjusted sample size                                  458
Returned questionnaires (b)                            93
Response rate (based on adjusted sample size)       20.3%
(a) The instructions to the questionnaire stated that it should only be completed if tax planning was a “significant” part (defined as at least 10%) of the tax professional’s practice. Four subjects contacted us to let us know they did not meet the criteria. It is likely that this number is understated; therefore the true response rate of eligible participants is likely higher than what is reported here.
(b) All returned questionnaires were useable.
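The response-rate figures in Table 2 can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only; the counts are taken from Table 2, while the variable names are hypothetical and not part of the original study.

```python
# Response-rate arithmetic from Table 2 (variable names are illustrative).
mailed = 509
undeliverable = 45
deceased = 2
not_eligible = 4
returned = 93

# Adjusted sample: questionnaires that could have reached an eligible respondent.
adjusted_sample = mailed - undeliverable - deceased - not_eligible
response_rate = returned / adjusted_sample

print(f"Adjusted sample size: {adjusted_sample}")
print(f"Response rate: {response_rate:.1%}")
```

Because the count of ineligible recipients is probably understated, as the note to Table 2 points out, a larger `not_eligible` value would shrink the denominator and raise the computed rate.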
RESULTS
Descriptive Statistics
Table 3 presents the mean response to each variable. The mean of the dependent variable, advice aggressiveness, indicates that the advice respondents gave on their last engagement was slightly more aggressive than average. They also indicated that the client was more aggressive than the average client (4.49 on a 7-point scale). The tax issue was somewhat complex and the tax dollars at stake were more than average. In general, the subjects were not particularly concerned with the risks associated with the firm or the IRS. Concern for client loss averaged just over three on a 10-point scale (where 1 = didn’t even think about it, and 10 = very concerned about it). The biggest concern was for taxpayer penalties, and that mean was still less than five.
Table 3. Descriptive Statistics.
Variable                                        Mean      Standard Deviation
Complexity (a)                                  4.84      1.22
Ambiguity (a)                                   3.58      1.66
Issue experience (a)                            4.83      1.63
Tax dollars at stake (a)                        4.60      1.52
Client size (a)                                 4.22      1.40
Client importance (a)                           4.22      1.48
Client strength (a)                             5.28      1.27
Client relationship (a)                         5.58      1.10
Client aggressiveness (a)                       4.49      1.32
Advice aggressiveness (a)                       4.14      1.16
Concern for preparer penalty (b)                2.96      2.82
Concern for IRS audit (b)                       3.74      2.65
Concern for taxpayer penalty (b)                4.68      3.26
Concern for professional liability (b)          4.31      3.27
Concern for client loss (b)                     3.11      2.87
Gender (% male)                                 72%       0.45
Years of experience                             19.75     7.75
Firm size (# of professionals in office)        5.46      5.80
Fee charged                                     $1,085    2,237
Hours spent by self                             7.66      12.61
Hours spent by others                           2.32      7.41
Note: See appendix for actual questions asked.
(a) Measured on a 7-point scale.
(b) Measured on a 10-point scale.
Test of Roberts’ Model
Factor Analysis Results
Factor analysis of the responses was performed to examine whether these measures loaded consistently with the factor descriptions provided by Roberts (1998). Table 4 reports the results of this factor analysis. Varimax rotation factor loadings are reported. Five factors were retained by the procedure. In total, the five factors explained 65.8% of the variance.

Table 4. Factor Analysis Results. (a)
Variable                               Factor 1  Factor 2  Factor 3  Factor 4  Factor 5
Client strength                        0.836
Client importance                      0.768
Client size                            0.761
Client relationship                    0.633
Issue complexity                                 0.708
Issue experience                                 0.642
Tax dollars at stake                             0.628
Law ambiguity                                    0.548     0.466
Concern for IRS audit                                      0.801
Concern for preparer penalty                               0.798
Concern for taxpayer penalty                               0.559
Concern for client loss                                              0.779
Concern for professional liability                                   0.774
Client aggressiveness                                                          0.899
% Variance explained                   21.4%     16.9%     12.3%     8.0%      7.0%
Factor description: Factor 1, client risks & rewards; Factor 2, task factors; Factor 3, IRS risks & rewards; Factor 4, firm risks & rewards; Factor 5, client aggressiveness.
(a) Varimax rotation factor loadings are reported. One variable, Law Ambiguity, had relatively high loading scores on two factors, therefore both are reported.

Two of the five factors (Factors One and Five in Table 4) represent client characteristics (or in Roberts’ vernacular, “risks and rewards associated with the client”). Factor One includes client strength, importance, size and relationship. This factor explained 21.4% of the variance. Client aggressiveness loaded as its own factor (Factor Five), explaining 7% of the variance. Factor Two represents task characteristics. Issue complexity, issue experience, tax dollars at stake and law ambiguity loaded on this factor and explained 16.9%
of the variance. Factor Three represents risks and rewards associated with the IRS. Concern for IRS audit, concern for preparer penalties and concern for taxpayer penalties loaded on this factor and explained 12.3% of the variance. Finally, Factor Four represents risks and rewards associated with the CPA’s firm. Concern for professional liability and concern for client loss loaded on this factor and explained 8% of the variance. The only variable that did not load in a manner consistent with Roberts’ model was tax dollars at stake. Roberts viewed this variable as a risk/reward associated with the client. However, our subjects merely viewed it as a characteristic of the task.6 There were five individual variables collected: gender, education, firm size, years of experience and issue experience. Only issue experience was included in the factor analysis because the other four items are independent of the particular client engagement. Issue experience loaded with the task characteristics, which suggests that the level of specific experience influences the perceptions of the task itself. In summary, the factor analysis results appear to validate (with very few exceptions) Roberts’ characterization of the inputs to a tax professional’s cognitive
processing of a judgment and decision-making task. It is interesting to note that the tax professionals viewed client aggressiveness as distinct from other client characteristics. As will be discussed below, when we examine the regression results, client aggressiveness appears to be a particularly salient characteristic.
Regression Results
The factor scores generated from the factor analysis procedure were used as independent variables, along with gender, firm size, education and years of experience, in a regression with advice aggressiveness as the dependent variable. The results of this regression are reported in Table 5. The regression model was significant (p < 0.001). The model R2 was 0.338 (adjusted R2 was 0.259). Four of the five factors were significant in explaining advice aggressiveness. Only the factor representing risks and rewards associated with the firm was not significant (p-value = 0.219). None of the separately considered individual characteristics (i.e. gender, education, firm size, and years of experience) were significant in explaining advice aggressiveness.7

Table 5. Regression Results: Advice Aggressiveness.
Independent Variables            Parameter Estimates   Standardized Coefficients   p-Value
Intercept                        4.252                                             0.000
Factors (a)
    Client characteristics       0.269                 0.231                       0.020
    Task characteristics         0.282                 0.250                       0.011
    IRS risks and rewards        −0.220                −0.189                      0.055
    Firm risks and rewards       0.149                 0.126                       0.219
    Client aggressiveness        0.433                 0.377                       0.000
Other variables
    Gender                       0.097                 0.037                       0.729
    Firm size                    0.003                 0.015                       0.884
    Years of experience          0.001                 0.004                       0.971
    Education                    −0.190                −0.090                      0.384
Model statistics
    Model mean square            4.300
    F-statistic                  4.259
    Model P-value                0.000
    Model R2                     0.338
    Model adjusted R2            0.259
Note: Dependent variable: advice aggressiveness.
(a) Table 4 reports complete factor loadings.

Examination of the
variance inflation factors (VIF) indicates that multicollinearity is not affecting these results. The standardized coefficients reported in Table 5 indicate that client aggressiveness was the most important influence, followed by task characteristics, client characteristics, and, finally, risks and rewards associated with the IRS. All of the factors positively affected advice aggressiveness, except, of course, for risks and rewards associated with the IRS, which had a negative coefficient. Based on the variable coding, the direction of these coefficients implies that large, important, financially strong, and aggressive clients received more aggressive advice. Also, the more ambiguous and complex the issue and the larger the dollar amount, the more aggressive was the advice. However, the more concerned the tax professional was about IRS involvement via audit, or penalties, the less aggressive was the advice provided. Discussion It was somewhat surprising that the factor representing concern for client loss and professional liability8 was not significant. One possible explanation is that the CPAs in our sample are not particularly concerned about either of these “risks.” Table 3 gives some support to this idea. Concern for client loss and concern for professional liability were both measured on a 10-point scale with 1 indicating that the respondent didn’t even consider the outcome when formulating his/her advice and 10 indicating that the respondent was very concerned with the outcome. The mean response for client loss was only 3.11, and more than half of the respondents rated it at a “1.” The mean response for professional liability was only 4.31. Therefore, for both of these “risks,” the mean response did not even reach the midpoint of the scale. This lack of concern may be a result of the tax professionals in our sample. Our sample was made up almost exclusively of CPAs from small firms. 
Cox and Radtke (2000) found that Big 5 CPAs feel more pressure from their firms than CPAs from smaller firms. Therefore, it may be that if Big 5 CPAs were included in the sample, the risks and rewards associated with the firm would have influenced advice aggressiveness. In addition, the mean fee for the engagement in our sample is $1,085. It may be that loss of such a relatively small fee does not represent much risk to the tax preparer.9 In summary, the regression results are generally consistent with the model proposed by Roberts. Client aggressiveness had the largest influence on the aggressiveness of the tax professionals’ advice. However, task characteristics, risks and rewards associated with the IRS, and other client characteristics were also significantly related to advice aggressiveness. The only surprising “non” result was that neither concern for client loss nor professional liability appeared to significantly influence the aggressiveness of the professionals’ advice.
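To make the Table 5 estimates concrete, the Python sketch below computes the predicted change in advice aggressiveness (on its 7-point scale) for a given change in the factor scores. It assumes, as is common for factor scores but is not stated explicitly in the text, that each factor score is standardized so that one unit equals one standard deviation. The function and dictionary names are hypothetical.

```python
# Unstandardized parameter estimates for the five factors, taken from Table 5.
TABLE5_FACTOR_COEFS = {
    "client_characteristics": 0.269,
    "task_characteristics": 0.282,
    "irs_risks_rewards": -0.220,
    "firm_risks_rewards": 0.149,   # not significant (p = 0.219)
    "client_aggressiveness": 0.433,
}


def predicted_change(factor_deltas):
    """Predicted change in advice aggressiveness for the given factor-score changes.

    factor_deltas maps a factor name to its change in factor-score units
    (assumed here to be standard deviations); omitted factors are held constant.
    """
    return sum(TABLE5_FACTOR_COEFS[name] * delta
               for name, delta in factor_deltas.items())


# One-unit increase in the client aggressiveness factor, all else held constant.
only_client = predicted_change({"client_aggressiveness": 1.0})

# One-unit increases in both client aggressiveness and IRS concern partly offset.
client_and_irs = predicted_change({"client_aggressiveness": 1.0,
                                   "irs_risks_rewards": 1.0})

print(round(only_client, 3))
print(round(client_and_irs, 3))
```

Under this assumption, a more aggressive client raises predicted advice aggressiveness by 0.433 scale points, while an equal rise in IRS concern offsets roughly half of that increase (0.433 − 0.220 = 0.213).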
Test of Fee Determinants
Regression Results
To investigate whether tax professionals receive a premium for providing aggressive advice, we asked respondents to estimate the fee they charged as well as the amount of time they spent on the engagement. As reported in Table 3, the mean fee was $1,085, and the average number of hours spent was 7.66 by the tax professional, and 2.32 by others in the firm. As was expected, time spent and fees were highly correlated.10 We regressed fee on time spent, advice aggressiveness, and four other variables that may be related to fee: gender (Ashton, 2000), years of experience, size of firm (measured as number of professionals in the office), and education. The regression results are reported in Table 6. Table 6 reports a high model R2 of 0.89 (adjusted R2 of 0.88), primarily due to the inclusion in the regression of a measure of time spent. Advice aggressiveness was not significant (p-value = 0.879). However, firm size and gender were significantly related to fees. The coefficient for firm size had a t-statistic that was significant at the 0.063 level (two-sided test). The larger the firm, the higher were the fees. Gender was significant at the 0.014 level, and after time spent, was the most influential variable (based on the standardized regression coefficient). Males charged significantly more than females. Years of experience (p-value = 0.116) and education (p-value = 0.214) were not significantly related to the fee charged.

Table 6. Regression Results: Fee Charged for Engagement.
Dependent variable: Fee charged
Independent Variables            Parameter Estimates   Standardized Coefficients   p-Value
Intercept                        −414.66                                           0.299
Time spent by self               116.99                0.862                       0.000
Time spent by others             46.76                 0.119                       0.014
Advice aggressiveness            9.15                  0.006                       0.879
Gender                           442.10                0.108                       0.014
Firm size                        23.48                 0.080                       0.063
Years of experience              −16.13                −0.071                      0.116
Education                        162.54                0.052                       0.214
Model statistics
    Model mean square            30,716,477
    F-statistic                  84.62
    Model P-value                0.000
    Model R2                     0.890
    Model adjusted R2            0.880
Examination of the VIFs indicates that multicollinearity is not affecting these results.
Discussion
As expected, fees were strongly related to the amount of time devoted to the engagement. However, we found no direct evidence that aggressiveness is priced. Aggressiveness could be priced indirectly, if aggressive advice leads to more time spent on the engagement. This could occur either because the tax professional “charges” for more time, or because they actually spend more time researching the issue in order to gain comfort with providing aggressive advice. A significant correlation between time spent and advice aggressiveness would be consistent with this notion. However, the Pearson correlation coefficient between time spent and advice aggressiveness is only 0.134, which is not significant (p-value = 0.202). Thus, we have no evidence, either direct or indirect, that tax professionals charge a premium for aggressive advice. The most interesting result regarding the determinants of fees, however, is the significant gender effect. Since gender is an indicator variable, the coefficient on gender can be interpreted as a dollar amount. As shown in Table 6, even after controlling for time spent, firm size, experience and education, females, on average, charged $442 less than males.11 This result is consistent in magnitude with the findings in Ashton (2000), who reported that males charged a 57% premium compared with females. In a study focusing on the profitability of small CPA firms, Fasci and Valdez (1998) found that female-owned CPA firms were significantly less profitable than male-owned CPA firms, even after controlling for age of the business, education, experience, time spent on the business and motivation for owning the business. While we cannot explain why females appear to undervalue their work product, evidence continues to indicate that they do. 
Fasci and Valdez (1998) identify several “disadvantages” faced by women that may result in lower productivity including socialization practices, family roles and lack of networks or contacts. The extent to which any or all of these factors influence the amount of fees female tax professionals charge remains unknown. Future research is necessary to understand this result.
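The significance test behind the reported correlation between time spent and advice aggressiveness can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the correlation was computed on the full sample of 93 respondents (the usable number of observations may have been smaller), and the function names are hypothetical. The t distribution is integrated numerically so that only the standard library is needed.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value, integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2.0 * total * h / 3.0  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_area)


def correlation_t(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


r = 0.134   # correlation reported in the text
n = 93      # ASSUMPTION: full sample size; the usable n may be smaller
t_stat = correlation_t(r, n)
p_value = two_tailed_p(t_stat, n - 2)
print(f"t = {t_stat:.3f}, two-tailed p = {p_value:.3f}")
```

Under that assumption the sketch gives a two-tailed p-value of roughly 0.20, in line with the reported 0.202 and well above conventional significance thresholds.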
Limitations

The advantages of using the structured questionnaire approach, such as the collection of a variety of variables and the focus on actual tax planning engagements, do not come without cost. For example, asking CPAs to select the engagement on which to base their responses may have introduced bias (Gibbins et al., 2001), although we did ask them to recall their last tax planning engagement. There is also the possibility that the subjects’ memories of the factors that influenced them are not correct, or that important influences were not included in the questionnaire. Use of factor analysis mitigates some of the potential multicollinearity concerns.

Non-response bias is also a concern. An analysis of late respondents was performed to investigate this concern. The results were qualitatively similar to those reported here when only the late respondents were considered. However, since the number of observations was lower, the significance of the variables (particularly the task characteristics factor) was somewhat lower.

In addition, this study is limited by the number and homogeneity of participants. While the results do not seem affected by a lack of power (with the possible exception of firm risks and rewards in the main regression), the sample size (93) is somewhat low for a regression with this number of variables. Removal of nonsignificant variables does not improve the significance levels of the other variables. Also, since the CPAs sampled were from one area of the country, generalization of the results to other geographical areas must be done with caution. Finally, our test of the relationship between advice aggressiveness and fees may have been underpowered given the possible lack of complexity (room to be aggressive) in the engagements in our sample.
SUMMARY AND FUTURE RESEARCH

This study had two specific objectives for extending prior research. The first objective was to provide a comprehensive test of the validity of Roberts’ (1998) tax professional judgment and decision-making model. A second objective was to determine whether tax professionals charge a premium for aggressive advice.

Regarding the first objective, we performed a two-step analysis. The first step consisted of a factor analysis of many of the inputs to tax professionals’ cognitive processing regarding a tax judgment and decision-making task. The second step consisted of regression analysis with advice aggressiveness as the dependent variable, and the factor scores from the first step, along with four other individual-psychological factors as independent variables.

The factor analysis results revealed that, consistent with Roberts’ (1998) model, the variables loaded nicely on factors that represented client concerns, IRS concerns, firm concerns and task characteristics with two exceptions. Tax dollars at stake, a variable that Roberts identified as a risk/reward associated with the client, loaded with task characteristics. Also, issue experience, which Roberts viewed as an individual-psychological factor, appeared to be related to how the tax professional viewed the task.
The regression analysis with advice aggressiveness as the dependent variable and the factor scores as independent variables, along with four individual-psychological factors, namely years of experience, education, gender and firm size, revealed that it was the client concerns (for example, client aggressiveness, client size), task characteristics (for example, law ambiguity), and risks and rewards associated with the IRS that influenced advice aggressiveness. Client aggressiveness appeared to be particularly influential. It loaded as a separate factor and had the largest standardized regression coefficient.

The only factor that was not significant was the factor that represented risks and rewards associated with the tax professionals’ firm. The two variables that loaded on this factor were concern about professional liability and concern for client loss. We offer two possible reasons for this lack of effect. First, these concerns may be less of an issue with CPAs from small firms than they would be with Big 5 CPAs. This is consistent with the results of Cox and Radtke (2000). Therefore, additional research regarding these variables should be carried out with a population of Big 5 CPAs. Second, concerns about professional liability and concern for client loss may affect CPAs at other points in the judgment and decision-making process. For example, these concerns may cause them to spend more time performing their information search or analyzing alternatives. Roberts (1998) called for additional research in general on the cognitive processing by tax professionals. We echo this advice.

Regarding the second objective, we performed regression analysis with fees as the dependent variable, and time spent on the engagement, advice aggressiveness, firm size, gender, education and years of experience as explanatory variables. Time spent was highly significant and was, by far, the biggest determinant of fees.
Advice aggressiveness was not related to fees, or even to the amount of time spent. Therefore, we tentatively conclude that tax professionals, at least from small firms, do not charge a premium for aggressive advice. A possible explanation could be that tax professionals spread risk among all of their clients by incorporating a risk factor into their hourly billing rates.12 This particular explanation, as well as the general relationship between fees and advice, warrants additional research. The variables firm size and gender were significantly related to fees, with larger firms and males charging more. The gender result is of particular concern. Even after controlling for education, time spent, years of experience and firm size, it appears that males charge significantly more than females. This result is consistent with prior research results reported in Ashton (2000) and Fasci and Valdez (1998). None of these studies, including ours, provides an adequate explanation for why females appear to undervalue (relative to males) their work product. We urge future researchers to investigate the cause of this undervaluing so that prescriptive advice can be provided to female accounting professionals.
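The reading of the gender coefficient as a dollar amount can be illustrated with a small sketch. Everything in it is hypothetical: the data are synthetic and constructed without noise, the coding of the indicator (1 = female, 0 = male) is an assumption, and only one control (hours) is included rather than the full set of explanatory variables used in the study. It shows only that, with an indicator variable in a fee regression, the fitted coefficient equals the fee difference between the two groups at the same number of hours.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# SYNTHETIC data, invented for illustration only (not the study's data).
hours = [2, 5, 8, 10, 3, 6, 9, 12]
female = [0, 0, 0, 0, 1, 1, 1, 1]   # ASSUMED coding: 1 = female, 0 = male
fees = [200 + 100 * h - 442 * f for h, f in zip(hours, female)]

design = [[1.0, h, f] for h, f in zip(hours, female)]  # intercept, hours, gender
intercept, per_hour, gender_gap = ols(design, fees)
print(f"per hour: {per_hour:.2f}, gender coefficient: {gender_gap:.2f}")
```

Because the synthetic fees were generated with a gap of 442 between the groups, the fitted gender coefficient recovers that gap (with a negative sign under the assumed coding), which is the sense in which the coefficient can be read directly in dollars.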
NOTES

1. This figure was taken from an earlier version of Roberts (1998).
2. Ashton’s results are inconclusive for at least two reasons. First, his results were significant at the 0.10 level. Second, he defined aggressive in terms of the direction and magnitude of error from a predetermined “correct” tax due. Thus, females may have been more aggressive, or may have just been wrong more often in a tax-decreasing direction.
3. Roberts’ individual-psychological category included both characteristics of the tax professional (e.g. experience, knowledge, professional status) and psychological attributes of the tax professional (e.g. advocacy, aggressiveness). In this study, we focus on tax professional characteristics as opposed to psychological attributes and thus from this point forward we will describe these variables as “individual” factors instead of individual-psychological factors.
4. All but six of the subjects reported that they worked in a firm that was “not Regional/National.” Of those six, three did not respond to the question, one was from a Big-5 firm, one from a regional firm and one from a national firm. When those six subjects were eliminated from the analyses, the reported results did not change.
5. The average firm focused on tax work has about four licensed professionals and the median firm has just one or two professionals (Russell, 2002).
6. The loading of tax dollars at stake with client characteristics was 0.321, compared to 0.628 with task characteristics.
7. When all of the individual factors (gender, firm size, years of experience and education) were removed, the only significant change in the results is that the factor representing firm risks and rewards became marginally significant (p = 0.07).
8. We also collected data regarding whether or not the tax professional had previous experience with a malpractice claim. Inclusion of this variable does not change the reported results.
9. We also considered that, while concern about professional liability and client loss were positively correlated with each other, they might have differing effects on advice aggressiveness. However, inspection of the correlation coefficients between each variable and advice aggressiveness indicated that neither was significantly correlated with the dependent variable and replacing the Factor 4 score from the model in Table 5 with the individual variables did not produce significant results for either variable. Finally, the effect of client loss concerns on advice aggressiveness might differ depending upon how aggressive the client was. In other words, concern for client loss should only increase advice aggressiveness when the client was also aggressive. Thus we added an interaction term between Factors 4 and 5 to the model in Table 5. This interaction was not significant. However, since the interaction effect does not necessarily relate to professional liability, we considered a client loss × client aggressiveness interaction separately by dichotomizing the two variables at the median. This analysis also did not produce a significant interaction effect between concern for client loss and client aggressiveness.
10. The Pearson correlation coefficient between fee and time spent is 0.841, and that between fee and time spent by others is 0.787.
11. We also collected the tax professional’s position in the firm (e.g. partner, manager, etc). Position was correlated with gender; however, when it was included in the regression it was not significant (and did not cause multicollinearity problems). It did, however, reduce the coefficient on gender to 365 (which was still significant).
12. Another possible explanation is that the client assumes the risk for aggressive advice, and thus the tax professional does not find it necessary to charge more for this type of advice. This may be the case if the tax professional adequately informs his/her client of the risks associated with taking an aggressive position and lets the client decide whether or not to take the position.
ACKNOWLEDGMENTS

We appreciate helpful comments from Dale Bandy, Peggy Dwyer, Andrew Judd, Lois Mahoney, Robin Roberts, participants at a University of Central Florida research workshop, and two anonymous reviewers. The first author is grateful to the PriceWaterhouseCoopers Foundation for financial assistance.
REFERENCES

Anderson, S., & Wolfe, J. (2001). Accountants’ liability: Where are claims coming from? The Ohio CPA Journal, 60(4), 21–24.
Ashton, R. H. (2000). Accuracy, agreement, and aggressiveness in tax reporting: Evidence from the money magazine contests. Advances in Taxation, 12, 1–21.
Ayers, F. L., Jackson, B. R., & Hite, P. A. (1989). The economic benefits of regulation: Evidence from professional tax preparers. The Accounting Review, 64(2), 78–87.
Bandy, D. (1996). Limiting tax practice liability. The CPA Journal, 66(5), 46–50.
Bandy, D., Betancourt, L., & Kelliher, C. (1994). An empirical study of the objectivity of CPAs’ tax work. Advances in Taxation, 6, 1–23.
Behn, B. K., Carcello, J. V., Hermanson, D. R., & Hermanson, R. H. (1999). Client satisfaction and big 6 audit fees. Contemporary Accounting Research, 16(4), 587–608.
Carnes, G. A., Harwood, G. B., & Sawyers, R. B. (1996). The determinants of tax professionals’ aggressiveness in ambiguous situations. Advances in Taxation, 8, 1–26.
Christensen, A. L. (1992). Evaluation of tax services: A client and preparer perspective. Journal of the American Taxation Association, 14(Fall), 60–87.
Cloyd, C. B. (1995). Prior knowledge, information search behaviors, and performance in a tax research task. The Journal of the American Taxation Association, 17(Suppl.), 82–107.
Cloyd, C. B., & Spilker, B. C. (1999). The influence of client preferences on tax professionals’ search for judicial precedents, subsequent judgments and recommendations. The Accounting Review, 74(3), 299–322.
Cox, S. R., & Radtke, R. R. (2000). The effects of multiple accountability pressures on tax return preparation decisions. Advances in Taxation, 12, 23–50.
Cuccia, A. D. (1994). The effects of increased sanctions on paid preparers: Integrating economic and psychological factors. The Journal of the American Taxation Association, 16(1), 41–66.
Cuccia, A., Hackenbrack, K., & Nelson, M. W. (1995). The ability of professional standards to mitigate aggressive reporting. The Accounting Review, 70(2), 227–248.
Department of the Treasury (1994). Treasury Department Circular 230. 31 CFR, subtitle A, sections 10.0–10.98 and 10.100–10.101. Washington, DC: Department of the Treasury.
Duncan, W. A., LaRue, D. W., & Reckers, P. M. J. (1989). An empirical examination of the influence of selected economic and noneconomic variables in decision making by tax professionals. Advances in Taxation, 2, 91–106.
Fasci, M. A., & Valdez, J. (1998). A performance contrast of male- and female-owned small accounting practices. Journal of Small Business Management, 36(3), 1–7.
Francis, J. R., & Simon, D. T. (1987). A test of audit pricing in the small-client segment of the U.S. audit market. The Accounting Review, 62(1), 145–157.
Frischmann, P. J., & Frees, E. W. (1999). Demand for services: Determinants of tax preparation fees. Journal of the American Taxation Association, 21(Suppl.), 1–23.
Gibbins, M., Salterio, S., & Webb, A. (2001). Evidence about auditor-client management negotiation concerning client’s financial reporting. Journal of Accounting Research, 39(3), 535–563.
Helleloid, R. T. (1989). Ambiguity and the evaluation of client documentation by tax professionals. The Journal of the American Taxation Association, 11, 22–36.
Hite, P. A. (1992). An examination of taxpayer preference for aggressive tax advice. National Tax Journal, 45(4), 389–403.
Kaplan, S., Reckers, P. M. J., West, S., & Boyd, J. (1988). An examination of tax reporting recommendations of professional tax preparers. Journal of Economic Psychology, 9(4), 427–443.
Klepper, S., & Nagin, D. (1989). The role of tax preparers in tax compliance. Policy Sciences, 22, 167–194.
LaRue, D., & Reckers, P. M. J. (1989). An empirical examination of the influence of selected factors on professional tax preparers’ decision process. Advances in Accounting, 7, 37–50.
McGill, G. A. (1990). The CPA’s aggressive position recommendation decision: Situational, attitudinal, and personality factors. Working Paper, University of Florida, Gainesville, Florida.
Newberry, K. J., Reckers, P. M. J., & Wyndelts, R. W. (1993). An examination of tax practitioner decisions: The role of preparer sanctions and framing effects associated with client condition. The Journal of Economic Psychology, 11(1), 119–146.
O’Keefe, T. B., King, R. D., & Gaver, K. M. (1994). Audit fees, industry specialization, and compliance with GAAS reporting standards. Auditing: A Journal of Practice and Theory (Fall), 41–54.
Pei, B. K. W., Reckers, P. M. J., & Wyndelts, R. W. (1992). Tax professionals belief revision: The effects of information presentation sequence, client preference, and domain experience. Decision Sciences, 23(1), 175–199.
Phillips, J., & Sansing, R. C. (1998). Contingent fees and tax compliance. The Accounting Review, 73(1), 1–18.
Reckers, P. M. J., Sanders, D. L., & Wyndelts, R. W. (1991). An empirical investigation of factors influencing tax practitioner compliance. The Journal of the American Taxation Association, 13(2), 30–46.
Roberts, M. L. (1998). Tax accountants’ judgment/decision-making research: A review and synthesis. The Journal of the American Taxation Association, 20(1), 78–121.
Roberts, M. L., & Cargile, B. R. (1994). Impartiality vs. advocacy: CPA’s responses to conflict in auditing and tax situations. Working Paper, University of Alabama, Tuscaloosa, Alabama.
Roberts, M. L., & Klersey, G. F. (1996). Effects of authoritative guidelines and experience on tax decision making. Working Paper, University of Alabama, Tuscaloosa, Alabama.
Russell, R. (2002). Independent practitioners make over half their earnings from tax preparation. Accounting Today, 16(Fall), 6–7.
Schisler, D. L. (1994). An experimental examination of factors affecting tax preparers’ aggressiveness – A prospect theory approach. The Journal of the American Taxation Association, 16(2), 124–142.
Schisler, D. L. (1995). Equity, aggressiveness, consensus: A comparison of taxpayers and tax preparers. Accounting Horizons, 9(4), 76–87.
Simunic, D. A. (1980). The pricing of audit services: Theory and evidence. Journal of Accounting Research, 18(Spring), 161–190.
Spilker, B. C., Worsham, R. G., & Prawitt, D. F. (1999). Tax professionals’ interpretations of ambiguity in compliance and planning-decision contexts. Journal of the American Taxation Association, 21(2), 75–89.
Wladis, R. (1995). Professional liability survey. Pennsylvania CPA Journal, 66(5), 30–33.
Yancey, W. F. (1996). Managing a tax practice to avoid malpractice claims: Learning from past disasters. The CPA Journal, 66(2), 12–18.
APPENDIX: TAX ADVISOR QUESTIONNAIRE AVERAGE RESPONSES

Please recall the most recent issue on which you provided tax planning advice to one of your clients. Please answer the following questions with that client and issue in mind. You will never be asked to reveal yourself or your client (circle a number to indicate your response to each question).

(1) How complicated was the tax issue for this specific client, relative to other tax issues which you have advised this or other clients?
Scale: 1 = Very simple, 4 = Average, 7 = Very complicated. Average: 4.84

(2) How clear was the authority regarding this tax issue, relative to other tax issues which you have advised this or other clients?
Scale: 1 = Very Clear, 4 = Average, 7 = Very Ambiguous. Average: 3.58

(3) How much experience do you have with this tax issue, relative to other tax issues which you have advised this or other clients?
Scale: 1 = Less than Average, 4 = Average, 7 = More than Average. Average: 4.83
(4) Was the dollar amount of tax savings at stake for this issue large or small, relative to other tax issues you have advised clients?
Scale: 1 = Very Small, 4 = Average, 7 = Very Large. Average: 4.6

(5) Was the advice given about this issue covered by an engagement letter?
Response options: Yes / No. Yes: 15.0%

(6) How large (e.g. total revenues, total assets, etc.) is this client, relative to your other clients?
Scale: 1 = Very Small, 4 = Average, 7 = Very Large. Average: 4.22

(7) How important is this client, relative to other clients (e.g. total billing, other accounting services provided, referrals, etc.)?
Scale: 1 = Not Important, 4 = Average, 7 = Very Important. Average: 4.22

(8) How financially strong (e.g. solvency, net worth, etc.) is this client, relative to other clients?
Scale: 1 = Very Weak, 4 = Average, 7 = Very Strong. Average: 5.28

(9) How would you rate your working relationship with this client, relative to other clients?
Scale: 1 = Well Below Average, 4 = Average, 7 = Well Above Average. Average: 5.58
(10) How aggressive, regarding tax issues, would you say this client is, relative to other clients? That is, does this client prefer aggressive or conservative advice?
Scale: 1 = Very Conservative, 4 = Average, 7 = Very Aggressive. Average: 4.49

(11) How aggressive was the advice that you gave the client on this specific issue?
Scale: 1 = Very Conservative, 4 = Average, 7 = Very Aggressive. Average: 4.14

(12) How certain are you that the advice you gave your client would hold up in court if challenged by the IRS?
Scale: 1 = Very Certain, 4 = Average, 7 = Very Uncertain. Average: 2.71

(13) If the planned transaction was made by the client, challenged by the IRS, and the client lost, how likely would it be that the client would bring suit against you or your firm?
Scale: 1 = Very Unlikely, 4 = Not Sure, 7 = Very Likely. Average: 2.56

(14) Below are five negative outcomes that can result from providing tax planning advice. These outcomes may constrain the tax professional when providing advice to a client. Please rate the extent to which these outcomes influenced the advice you provided your client on this issue. Please enter a number between 1 and 10 below (with 10 indicating that you were very concerned with the specific outcome and 1 indicating that you didn’t even consider the outcome when formulating your advice).
Outcome: Average
Taxpayer Penalties: 4.68
Professional Liability (i.e. client sues firm for failed tax advice): 4.31
IRS Audit of Client: 3.74
Preparer Penalties: 2.95
Loss of Client: 3.11

To help us categorize your responses, please answer some demographic questions.

(1) How many years experience do you have as a tax accountant? 19.75 years

(2) What position do you hold in your firm?
% responding: Staff 0%, Manager 7.5%, Senior 2%, Partner/Owner 88%

(3) What is your highest degree achieved?
High School 0%, Associate Degree 6.5%, Bachelors Degree 56%, Masters Degree 35.5%, PhD 1%

(4) What is your gender?
Male 72%, Female 28%

(5) What % of your chargeable time is spent providing tax advice that requires some amount of research? 19%

(6) Are you a(n):
CPA 83.7%, Enrolled Agent 8.7%, Attorney 0%, Other 7.6%

(7) What size CPA firm do you work for?
Big 5 1.1%, National Firm 1.1%, Regional Firm 1.1%, Not Regional/National 96.7%
(How many professionals in office? 5.46)

(8) Please estimate how many hours were spent on this issue for your client by:
Yourself: 7 2/3 hours
Others: 2 1/3 hours
(9) Please estimate the amount of the fee you charged this client which was allocated to advice on this issue (enter dollar amount)? $1,085

(10) In general, how big of a concern is professional liability to you in your tax planning activities?
Scale: 1 = Not a Concern at All, 4 = Somewhat of a Concern, 7 = Very Much of a Concern. Average: 4.6
BEHAVIORAL IMPLICATIONS OF ALTERNATIVE GOING CONCERN REPORTING FORMATS

Chantal Viger, Asokan Anandarajan, Anthony P. Curatola and Walid Ben-Amar

ABSTRACT

The generally accepted method of presentation with respect to going-concern reporting in a global context is to modify the auditor’s report with an explanatory paragraph in addition to having a separate note to the financial statements. In Canada, however, the auditor’s report is clean, and the going concern uncertainty is restricted to the endnotes. This research, using Canadian students as subjects and conducted as a between-subjects experiment, examines unsophisticated investors’ responses to the signal conveyed by different auditor reporting formats (U.S. versus Canadian). The results indicate that the form of the auditor’s report does significantly influence subjects’ decisions to invest and their perception of risk.
Advances in Accounting Behavioral Research, Volume 7, 53–73
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07003-6

INTRODUCTION

A number of studies have examined the information content of different financial statement formats. Some studies examined the influence of words vs. numbers and graphics (Frownfelter & Fulkerson, 1998; Stocks & Tuttle, 1998). While
the general findings of these studies indicated that numbers and graphics convey information more clearly than words, the reality is that the auditor’s report, which is a frequently analyzed form of report, is stated in words. Hence, the presentation of information by management of the company and the auditor’s attestation potentially influences the financial statement users. The importance of the auditor’s report may take on greater meaning when the client faces financial distress that could threaten its going concern status. This issue has been recognized and addressed in most industrialized countries by means of the auditor’s report, which is modified with an explanatory paragraph (also referred to as “emphasis of matter”) that details the going concern uncertainty. Much research, therefore, has focused on the informational content of the going concern report and whether this report significantly influences the decision-making behavior of lenders and investors. The preponderant view is that descriptions of the going concern contingency in the endnotes to the financial statements suffice to warn the reader of potential problems (e.g. Elias & Johnston, 2001; Libby, 1979; Pringle et al., 1990). Although the general consensus is that the explanatory paragraph in the auditor’s report should not significantly influence users’ behavior (Elias & Johnston, 2001), it is required in the United States as well as other industrialized nations. In Canada, however, Section 5150 of the Canadian Institute of Chartered Accountants (CICA) Handbook does not require the auditor to modify his/her auditor’s report if the going concern contingency is described as a note in the financial statements (CICA, 2003). As a result, the auditor’s report is clean and the financial statement user has to read the disclosure of financial distress in the financial statements with no acknowledgement of said issue by the auditors. 
Canada issued exposure drafts in 1995–1996 that sought to change this reporting position. Under these exposure drafts (CICA, 1995, 1996), the auditor’s report would remain unqualified with no reference to a going concern uncertainty; however, the going concern contingency would be highlighted on the face of the Balance Sheet and Income Statement (in addition to a separate note clearly labeled as going concern assumptions). The CICA assumed that “commonality” or redundancy in the form of repetition would sufficiently accentuate the signal to financial statement users. While these exposure drafts were rescinded in 1999, the Accounting Standards Board of the CICA is revisiting the need to “minimize the differences between Canadian and United States’ GAAP in reporting the going concern contingency” (CICA, 1999, p. 2). Even if the exposure drafts had been adopted, a difference would still have existed between the Canadian and United States’ audit report in the presence of going concern uncertainties. More specifically, the United States’ method of presentation for a going concern uncertainty requires auditors to modify the auditor’s report with an explanatory paragraph in addition to management’s
separate note to the financial statements.1 The explanatory paragraph describes the events that cast doubt about the entity’s ability to remain as a going concern. The auditor’s report is technically referred to as an unqualified modified report because the explanatory paragraph serves as a “red flag” to financial statement users. Other countries have adopted a similar view to going-concern reporting requirements. In fact, the International Federation of Accountants (IFAC) requires auditors to modify their report by adding “an emphasis of matter paragraph” that highlights the going concern problem (IFAC, 1999). Thus, the Auditing Standards Board (ASB) of the United States and the IFAC positions are in stark contrast to the CICA, which posits that a note in the financial statements alone is a sufficient warning to investors. The contribution of this research is twofold. First, the results have implications for standard setters in Canada because of doubts cast on the adequacy of the current reporting requirements for a going concern issue. As mentioned above, Canada differs from the United States and other Western nations in that the auditor’s report in the presence of going concern uncertainties is “clean” rather than modified. 
Since this topic is a work in process by the Canadian Auditing Standards Board, the ongoing discussion entails whether the Standards Board: (a) should maintain the present method (a method criticized by Boritz (1991) as being too passive and sending a “mute” signal); (b) adopt an in-between position, for example still keep the report clean, but highlight the going concern uncertainty on the face of the Balance Sheet and Income Statement while referencing the going concern contingency footnote (the criticism above could hold here too, though to a lesser degree); or (c) adopt the approach used by the United States and other Western nations (that is, modify the audit report with a fourth explanatory paragraph detailing the going concern uncertainty and referencing the appropriate footnote). While this topic has been temporarily “shelved,” any research in this area may provide some insight to the Standards Board in their eventual deliberations. Recently, Anandarajan et al. (2002) examined this issue with respect to Canadian loan officers, clearly a sophisticated financial statement user group. This research expands on their findings by considering a less sophisticated financial statement user group. This study also seeks to contribute to the extant literature on the impact presentation formats have on individuals’ judgment. We examine whether various methods of presentation differentially affect the extent to which non-professional investors incorporate going concern information in investment decision judgments. Another reason for selecting non-professional investors is that regulators such as the Securities and Exchange Commission in the U.S. (e.g. Levitt, 1997) have
expressed an interest in understanding how financial reporting standards affect this investor group. Maines and McDaniel (2000) note that non-professional investors, due to their relatively limited understanding of financial information, would be more influenced by presentation formats than professional analysts. Further, Maines and McDaniel (2000) and Hunton and McEwen (1997), both of which used students as surrogates, state that, in comparison to analysts, non-professional investors: (a) generally have ill-defined valuation models; (b) fail to identify specific data needed for financial analysis; and (c) assimilate information in a relatively unstructured manner. They also note that non-professional investors read the financial statements in the order presented, suggesting that they have few preconceived ideas of the importance of and/or relations among various financial statement items. Given this sequential information processing, non-professional investors are likely to consider all information regardless of its location.
THEORY DEVELOPMENT AND HYPOTHESES FORMULATION

In the accounting literature, the organization of information has been shown to affect auditors’ going concern judgments (Ricchiute, 1992). While some studies warn that increasing the number of cues often “overloads” decision makers, leading to judgments of lower quality (Chewning & Harrell, 1990; Iselin, 1993), most researchers posit that if there is no information overload, then repetition and/or commonality influences judgment. Tversky and Kahneman (1973) indicate that availability of information, and, by extension, multiple redundancies result in greater understanding of the message conveyed. Slovic and MacPhillamy (1974) state that decision makers place greater weight on common measures and, unconsciously, on measures that are repeated. Slovic and MacPhillamy demonstrate that when two alternatives have a common attribute, along with unique attributes, the common attribute is weighted more. Payne et al. (1993) theorize that people choose simplifying strategies when making decisions. They note that reliance on common attributes or an attribute that is repeated is one such simplifying strategy; and more importantly, this form of decision making is not deliberate, but done subconsciously. These findings, especially the conclusions from cognitive and judgment research, indicate that multiple reinforcements of the going concern contingency would accentuate the signal and influence decision-making.

Further, the placement of particular items within the financial statements has also been shown to affect users’ judgments (though not in the context of going concern reporting). Hopkins (1996), for example, indicates that the location or placement of securities had an effect on financial analysts’ stock price judgments.
Hirst and Hopkins (1998) show that presenting comprehensive income in the Income Statement affects financial analysts’ stock price judgments differently than presenting the information in the Statement of Changes in Equity. The conclusion is that specific placement of particular pieces of information affected judgment and use of the information by financial statement users. In this context, an upfront reporting of the going concern uncertainty in the form of an explanatory (emphasis of matter) paragraph should accentuate the signal relative to a more subtle form of reporting, such as merely referencing on the face of the Income Statement and Balance Sheet and not reporting upfront on the auditor’s report. In this section we present a framework for evaluating how different formats for presenting going concern information affect investors’ judgments. This framework proposes that greater degrees of redundancy with reference to the presentation of going concern information may serve to focus readers’ attention more clearly on the matters raised. Based on the theory generated by the research cited above, we conclude that financial statement information incorporating varying levels of redundancy can influence investment decisions. Higher levels of redundancy accentuate the signal conveyed by the message. Hirst and Hopkins (1998) note that presentation format may affect analysts’ judgments partly because of the failure to sufficiently record information in memory. This shortcoming can be rectified by redundancy in information presentation. Similarly, Lipe and Salterio (2000) provide evidence that incorporating reinforcement and providing direct links between information may help decision makers mentally “chunk” these items and thus increase the emphasis on these items in forming judgments. In this research the “items” represent the going concern information. 
The “direct links” are the referencing of the going concern uncertainty in the explanatory (emphasis of matter) paragraph of the modified auditor’s report (United States format) and the referencing of the going concern uncertainty on the face of the Balance Sheet and Income Statement (proposed Canadian exposure draft). In Fig. 1, information acquisition is interpreted using the definition of Maines and McDaniel (2000, p. 183) as “an investor reading a specific financial statement item and storing the item in memory sufficiently well to recall where it appeared in the financial statements.” In the case of the control group, there is no information acquisition since the going concern uncertainty is neither discussed in a standalone note in the financial statements (the going concern uncertainty is disclosed like any other contingency) nor discussed in the explanatory paragraph. In the case of the format proposed by the Canadian exposure draft, there is information acquisition (as the going concern uncertainty is highlighted as a stand-alone note in the financial statements and referenced on the face of the Balance Sheet and Income Statement); and, as a result, an evaluation and a weighting given to this
realization when evaluating overall investment risk. Similarly, as shown in the figure, the United States format should also result in information acquisition and evaluation (as the going concern uncertainty is discussed in a stand-alone note in the financial statements and discussed in an explanatory paragraph in the auditor’s report). This paper postulates that information evaluation and hence “weighting” will be greater in the presence of a modified report with an explanatory (emphasis of matter) paragraph upfront detailing and referencing a going concern uncertainty relative to the formatting in the proposed Canadian exposure drafts. Similarly, “weighting” of the going concern contingency would be greater for the format proposed in the exposure draft (unqualified report with no upfront reference, but with the going concern contingency highlighted on the face of the Balance Sheet and Income Statement and referenced to a stand-alone footnote) relative to a situation where there is no signal whatsoever as to a contingency.

Fig. 1. Framework for Examining Effects of Different Formats of Going Concern Information on Subjects’ Risk Assessment.

Based on the above, the following hypothesis (stated in the alternative form) is tested:

H1. Investors’ decisions to invest will be significantly lower when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.

The literature reveals that investors’ perceptions can be measured using certain criteria (Bertholdt, 1979; Gul, 1987; LaSalle & Anandarajan, 1997). These criteria are discussed below. Gul (1987), for example, notes that increased levels of disclosure about an uncertainty, including a going concern uncertainty, may be expected to increase the estimated effect of the uncertainty on the results and position disclosed in the financial statements and hence on the variance of expected cash flows. Thus, more disclosure may result in a greater perception of risk.
Based on this, the second hypothesis is proposed and stated as follows:

H2. Investors’ perceptions of risk will be significantly higher when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.

Anandarajan et al. (2002) and LaSalle and Anandarajan (1997) found that the method of disclosure of the going concern uncertainty in the auditor’s report impacts users’ (loan officers in both studies) perceptions of the likelihood (or lack thereof) of a company improving its profitability. Based on this research, the third hypothesis is proposed and stated as follows:
H3. Investors’ perceptions that the company can improve its profitability will be significantly lower when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.

Libby (1979) found that uncertainty qualifications cause users to search for additional information in order to estimate the effects of the uncertainty. Bertholdt (1979) and Gul (1987) suggested that there might be an increase in information search by financial statement users with increasing levels of disclosure of the uncertainty. The theory is that redundant information in the form of additional disclosure (e.g. an explanatory paragraph in addition to a note in the financial statements) may heighten the perception of risk. This perception, in turn, may increase users’ need to assess the financial impact of that uncertainty on the company. As a result, the fourth hypothesis is proposed and stated as follows:

H4. Investors’ need for additional information will be significantly greater when the reference to the contingency is in the form of a modified report relative to an unmodified report in the presence of going concern uncertainties, given full disclosure in the notes to the financial statements.

Bamber and Stratton (1997) note that an explanatory paragraph on the audit report detailing the uncertainty may have the consequence of focusing a reader’s attention on financial statement elements particularly important for their task. The implication is that the explanatory paragraph will highlight and magnify the impact of any notes the auditor chooses to emphasize. If so, investors should rate a note in the financial statements that is highlighted in the auditor’s report as more important to their investment decision relative to a scenario where the note is not highlighted.
Consequently, the following two hypotheses are examined:

H5. Investors will weight the financial statement disclosure in the form of the going concern assumption footnote higher when it is referred to in the modified report format in their determination of factors entering into the investment decision.

H6. Investors will weight the modified audit report higher than the standard report in their determination of factors entering into the investment decision.
RESEARCH METHODOLOGY

Research Design

The research design was a 3 × 3 × 2 between-subject design, displayed in Fig. 2. This design was selected because subjects in a within-subjects design would have insight into the variable being manipulated. Such insight might have sensitized the subjects with respect to their responses to the questions in the instrument. Subjects were randomly assigned to one of two experimental groups or to a control group. Each participant received a case scenario relating to a fictitious company that presented a set of financial statements including a two-year corporate Balance Sheet and three years of Statements of Earnings and Retained Earnings and Statements of Changes in Financial Position. In addition, notes to the financial statements were provided for the current year only.

Fig. 2. Experimental Design.

Participants assigned to Group 1 were provided going concern information based on current United States reporting standards, namely a separate going concern uncertainty note in the financial statements and a modified auditor’s report with a fourth explanatory paragraph detailing the uncertainty. Participants assigned to Group 2 (as shown in Fig. 2) were provided going concern information based on the proposed (now rescinded) Canadian exposure draft, where the going concern uncertainty was separately highlighted in a stand-alone note combined with referencing on the face of the Balance Sheet and the Statement of Earnings and Retained Earnings, together with an unqualified auditor’s report. The comparisons between groups 1 and 2 were of primary interest to test our hypotheses; however, it was essential to ensure that the numbers in the financial statements did not drive the study results. Consequently, a control group (Group 3) was included in the research design. Subjects assigned to the control group were provided with identical information to that received by subjects of the experimental groups except that the going concern uncertainty was not disclosed separately from other contingencies and no reference was made to the going concern contingency in the financial statements and auditor’s report.

A comparison was made between the decisions and perceptions of the two experimental groups (Groups 1 and 2) and those of the control group (Group 3). As shown in Table 1, there were significant differences between the control group and the experimental groups on whether to invest in the company (p < 0.001). While 96% of the respondents in the control group indicated that they would invest in the company, only 48% of the respondents in the experimental groups indicated the same. Similarly, the differences between the control and experimental groups were all significant with respect to their perception of investment risk (p < 0.001), perception of the likelihood that the company can improve profitability (p = 0.005), and perception of the need to search for additional information (p = 0.002). These variations provide preliminary evidence that results are attributable to the reference to the going concern uncertainty rather than the numbers in the financial statements.

Table 1. Differences Between Experimental Groups (1 and 2) and Control Group (3).

Decision to invest                  Investing in the Company
Group                               Yes           No            Statistic (p-Value)
1. Experimental (groups 1 and 2)    28 (48.3%)    30 (51.7%)    χ² = 18.993 (0.000)
2. Control (group 3)                27 (96.4%)    1 (3.6%)

Group                               Mean    Standard Deviation    Statistic (p-Value)
Perception of investment risk
1. Experimental (groups 1 and 2)    3.59    0.80                  t = 6.951 (0.000)
2. Control (group 3)                2.39    0.63
Perception of likelihood that the company can improve its profitability
1. Experimental (groups 1 and 2)    3.24    0.90                  t = −2.917 (0.005)
2. Control (group 3)                3.82    0.77
Perception of need to search for additional information (a)
1. Experimental (groups 1 and 2)    3.18    1.31                  t = 79.395 (0.002)
2. Control (group 3)                2.43    0.79

(a) The endpoints of the five-point scales are as follows:
(1) Investment risk: Low risk to High risk
(2) Likelihood that company improve its profitability: Very unlikely to Very likely
(3) Search for additional information: Very unlikely to Very likely

Sample

The subjects selected for this research were master’s level students at the University of Quebec at Montreal.
These students were considered appropriate surrogates for non-professional investors because they have limited knowledge of accounting rules (Pringle et al., 1990); and, as such, they are less likely to be familiar with foreign country reporting rules. In contrast, other potential subjects such as Chartered Accountants (CA), CPAs or financial analysts might have been sensitized to changes in the standard auditor’s report; as a result, their decisions may be biased by their expectations of a particular audit report format. In addition, Walters-York and Curatola (1998, 2000) have concluded that the use of experienced students, as in this study, can provide meaningful results. Walters-York and Curatola (2000) note that research relying on student subjects is likely no less valid than research relying on non-student subject groups; student samples provide no greater threat to external validity than typical real-world samples. The customary real-world sample can be placed under the same scrutiny for lack of formal representativeness and atypicality as the customary student sample (p. 258).
The students who participated in this study were resuming their course work requirements to be eligible to sit for the National Chartered Accountants examination while working full-time. The average age of these students was 24.3 years, and they had completed an average of 8.33 accounting courses and almost two auditing courses in their university studies. A total of 86 students participated in the study and were randomly assigned to the three groups (29 in Group 1, 29 in Group 2, and 28 in Group 3). Statistical tests were conducted on the personal characteristics of the students in the three groups; the groups did not differ significantly in terms of age, number of courses in management and financial accounting, or number of auditing courses.
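As a rough check, the t statistics reported in Table 1 for investment risk and for the likelihood of improved profitability can be approximately recovered from the published means and standard deviations together with the group sizes given above (58 experimental subjects, 28 controls). The sketch below assumes a pooled-variance (Student) two-sample t test, which the paper does not state; the function and variable names are illustrative only.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Group sizes from the Sample section: 29 + 29 experimental subjects, 28 controls.
N_EXPERIMENTAL, N_CONTROL = 58, 28

# (mean, SD) pairs as published in Table 1: experimental first, control second.
TABLE_1_ROWS = {
    "investment risk": ((3.59, 0.80), (2.39, 0.63)),
    "likelihood of improved profitability": ((3.24, 0.90), (3.82, 0.77)),
}

T_STATS = {}
for label, ((m_exp, sd_exp), (m_ctl, sd_ctl)) in TABLE_1_ROWS.items():
    t_value, df = pooled_t(m_exp, sd_exp, N_EXPERIMENTAL, m_ctl, sd_ctl, N_CONTROL)
    T_STATS[label] = t_value
    print(f"{label}: t({df}) = {t_value:.3f}")
```

Under these assumptions the recomputed values are about 6.96 and −2.93, close to the reported 6.951 and −2.917; small differences are expected because the published means and standard deviations are rounded to two decimals.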
Task

The experimental instrument provided to the students was developed from research instruments previously used (Bamber & Stratton, 1997; LaSalle & Anandarajan, 1997). It consisted of a covering letter, descriptive information about a hypothetical company, the auditor’s report, the Balance Sheet (2 years), Statements of Earnings and Retained Earnings (3 years), Statements of Changes in Financial Position (3 years), and notes to financial statements (including the note highlighting the going concern problem for the current year only). All subjects received the above information. The difference among the three groups was the auditor’s report and the manner by which the going concern uncertainty was disclosed (please refer to Fig. 2). Specifically, each participant received only one type of auditor’s report: the report with the explanatory paragraph (United States format) for Group 1, or the standard report (Canadian format) for Group 2 and the control group. For participants in the control group, the going concern uncertainty was not disclosed separately from other contingencies: it was integrated with the other footnotes rather than highlighted, and no reference to it was made in the financial statements or in the auditor’s report.
Response Instrument

The response portion of the experimental instrument was effectively broken into three sections and is shown in the appendix. The first question in the instrument sought to obtain background information about the respondents. Question 2 sought the strength of respondents’ interest in investing in the company. Questions 3, 4, and 6 aimed to examine respondents’ perceptions of risk (see Note 2). Subjects were asked to circle an answer on a Likert scale ranging from 1 to 5. The scales were set up so that a high score for questions 3 and 6 indicated a high perception of risk and a low score, a low perception of risk. In contrast, question 4 was set up so that a low score indicated a high perception of risk, and a high score, a low perception of risk. Finally, question 5 requested the subjects to assign 100 points across the different financial statement items according to each item’s perceived importance to the investment decision. Question 5 was assigned to the experimental groups only, because only those groups received the going concern contingency manipulation. The items of information were presented in the same order for participants of groups 1 and 2. Once the experiment was complete, appropriate manipulation checks were conducted to ensure that the subjects fully understood the meaning of the questions that were asked (see Note 3).
RESULTS

The first hypothesis (H1) related to the investment decision. A Chi-square test was performed to assess whether the subjects’ willingness to invest was affected by the difference in presentation in the auditor’s report. The results, as shown in Table 2, indicate that the difference between the two experimental groups is statistically significant (χ² = 9.943; p-value = 0.002). More specifically, of the 28 subjects who were willing to invest, only eight (28.6%) had received the unqualified modified report, while twenty (71.4%) had received the standard report. Overall, the results of H1 suggest that the explanatory paragraph included in the unqualified modified report (Group 1) did impact the investment decision.

Table 2. Cross-Tabulation of Audit Report and Investment Decision.

                       Experimental Groups
Investment Decision    Group 1 Actual U.S.    Group 2 Proposed Cd    Total
Yes                    8 (28.6%)              20 (71.4%)             28
No                     21 (70%)               9 (30%)                30
Total                  29                     29                     58

Note: χ² = 9.943, p-value = 0.002.

Table 3 displays the means and standard deviations of the responses to questions 3, 4, and 6. The results from the Mann-Whitney test indicate that the difference in presentation of the auditors’ report significantly influenced the investors’ assessment of the investment risk (H2) and the likelihood that the company can improve its profitability (H3) (p-value = 0.004 and 0.015, respectively). The difference in presentation format (H4), however, only marginally influenced the search for additional information about the going concern uncertainty (p-value = 0.084). In summary, these results indicate that, relative to the type of standard auditor’s report issued in Canada, the inclusion of the explanatory paragraph in the auditor’s report detailing the going concern uncertainty significantly increases the perception of risk associated with the company and significantly decreases the perception of the likelihood that the company can improve its profitability.

Table 3. Tests of Differences in Perceptions of Investors.

Group         Mean    Standard Deviation    z-Statistic (p-Value), One Tail Test
Investment risk
  Standard    3.31    0.71                  −2.64 (0.004)
  Modified    3.86    0.79
Likelihood that the company improve its profitability
  Standard    3.52    0.78                  −2.15 (0.015)
  Modified    2.97    0.94
Search for additional information
  Standard    2.93    1.36                  −1.37 (0.084)
  Modified    3.41    1.24

Note: The endpoints of the five-point scales are as follows:
Investment risk: Low risk to High risk
Likelihood that company improve its profitability: Very unlikely to Very likely
Search for additional information: Very unlikely to Very likely

Table 4 reveals the relative importance of the different items in the financial statements to the investment decision for groups 1 and 2. On average (as given in column 3 of Table 4), subjects weighted the contingencies footnote (footnote 9) most heavily as impacting their investment decision (with a weight of 17.56). Next, the Statement of Earnings and Retained Earnings, the Balance Sheet and the Statement of Changes in Financial Position were also considered important in the investment decision, with weights of 15.58, 13.87 and 13.50, respectively. Finally, the going concern assumption footnote (footnote 1A) was also considered to be important, with a weight of 9.60.
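The ordering described above can be checked directly against the overall column of Table 4. The short sketch below simply sorts the published overall mean weights; the dictionary and variable names are illustrative, and the values are transcribed from the table rather than recomputed from raw responses.

```python
# Overall mean decision weights (points out of 100) transcribed from the
# "Overall" column of Table 4; the dictionary name is illustrative.
OVERALL_MEAN_WEIGHTS = {
    "Balance sheet": 13.87,
    "Footnote 1A (Going concern assumption)": 9.60,
    "Footnote 1B (Summary of significant accounting policies)": 1.01,
    "Footnote 2 (Restructuring charge)": 3.87,
    "Footnote 3 (Income taxes)": 1.39,
    "Footnote 4 (Accounts receivable)": 4.58,
    "Footnote 5 (Inventories)": 3.88,
    "Footnote 6 (Capital assets)": 2.59,
    "Footnote 7 (Accounts payable)": 3.05,
    "Footnote 8 (Long term debt)": 5.37,
    "Footnote 9 (Contingencies)": 17.56,
    "Audit report": 3.33,
    "Statement of earnings and retained earnings": 15.58,
    "Statement of changes in financial position": 13.50,
}

# Sort items from the heaviest to the lightest mean weight.
RANKED = sorted(OVERALL_MEAN_WEIGHTS.items(), key=lambda item: item[1], reverse=True)

for rank, (item, weight) in enumerate(RANKED[:5], start=1):
    print(f"{rank}. {item}: {weight:.2f}")
```

The five heaviest items come out in the same order as in the text: the contingencies footnote, the Statement of Earnings and Retained Earnings, the Balance Sheet, the Statement of Changes in Financial Position, and the going concern assumption footnote.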
Table 4. Descriptive Statistics: Mean (Standard Deviation) of Decision Weights by Report Type.

                                                          Auditor Report
Item                                                      Standard         Modified         Overall Mean (Std. Dev.)
Balance sheet                                             16.30 (12.15)    11.60 (10.36)    13.87 (11.40)
Footnote 1A (Going concern assumption)                    4.38 (6.71)      14.44 (14.03)    9.60 (12.13)
Footnote 1B (Summary of significant accounting policies)  0.15 (0.46)      1.80 (3.51)      1.01 (2.66)
Footnote 2 (Restructuring charge)                         2.92 (3.36)      4.76 (6.95)      3.87 (5.55)
Footnote 3 (Income taxes)                                 1.15 (2.46)      1.62 (2.68)      1.39 (2.56)
Footnote 4 (Accounts receivable)                          4.61 (4.42)      4.55 (5.84)      4.58 (5.16)
Footnote 5 (Inventories)                                  5.38 (8.11)      2.50 (3.57)      3.88 (6.30)
Footnote 6 (Capital assets)                               2.88 (4.05)      2.32 (3.77)      2.59 (3.88)
Footnote 7 (Accounts payable)                             2.88 (3.70)      3.12 (4.12)      3.05 (3.89)
Footnote 8 (Long term debt)                               6.30 (6.41)      4.50 (5.44)      5.37 (5.94)
Footnote 9 (Contingencies)                                15.11 (15.31)    19.83 (15.73)    17.56 (15.57)
Audit report                                              2.96 (6.23)      3.67 (4.72)      3.33 (5.46)
Statement of earnings and retained earnings               17.80 (10.85)    14.03 (12.92)    15.58 (12.01)
Statement of changes in financial position                16.53 (8.98)     10.67 (8.10)     13.50 (8.05)
With respect to the auditor report, subjects gave different weights to the financial statement items depending on the audit report assigned. Subjects who received a standard report rated the Statement of Earnings and Retained Earnings (17.80), the Statement of Changes in Financial Position (16.56), and the Balance Sheet (16.30) as the three most important items to their investment decision. In contrast, subjects who received a modified report ranked the contingencies footnote (19.83), the going concern assumption note (14.44), and the Statement of Earnings and Retained Earnings (14.03) as the three most important items to their decision. One might suspect that the explanatory paragraph of the modified report directed the subjects’ attention to the information disclosed in the two notes related to going concern uncertainties. H5 and H6 examined whether the uncertainty modification affected the weight investors attached to the going concern assumption footnote (H5) and the audit report (H6) in their investment decision. The results of the Mann-Whitney test, as given in Table 5, indicate that the type of report has a significant effect on the weight given to footnote 1A, entitled Going concern assumption (z = −3.48; p-value = 0.000), providing support for H5. The difference due to the type of audit report, however, provided only marginally significant support for H6 (2.96 vs. 3.67; z = −1.41; p-value = 0.078). These results suggest that the uncertainty modification primarily operates to direct investors’ attention to the going concern assumption footnote.

Table 5. Results Related to the Nature of the Effect of the Uncertainty Modification.

Group         Mean     Standard Deviation    z-Statistic (p-Value), One Tail Test
Footnote 1A (Going concern assumption) weight
  Standard    4.38     6.71                  −3.48 (0.000)
  Modified    14.44    14.03
Audit report weight
  Standard    2.96     6.23                  −1.41 (0.078)
  Modified    3.67     4.72
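The chi-square statistics reported in Tables 1 and 2 can be reproduced from the published cell counts. The sketch below assumes a Pearson chi-square without continuity correction (the paper does not say which variant was used); the function and variable names are illustrative. For a 2 × 2 table there is one degree of freedom, so the p-value can be obtained from the complementary error function in the standard library.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Each row holds (yes, no) investment decisions for one group.
TABLE_1_COUNTS = [[28, 30],   # experimental groups 1 and 2
                  [27, 1]]    # control group 3
TABLE_2_COUNTS = [[8, 21],    # Group 1, modified report
                  [20, 9]]    # Group 2, standard report

CHI2_TABLE_1, P_TABLE_1 = pearson_chi_square_2x2(TABLE_1_COUNTS)
CHI2_TABLE_2, P_TABLE_2 = pearson_chi_square_2x2(TABLE_2_COUNTS)
print(f"Table 1: chi-square = {CHI2_TABLE_1:.3f}, p = {P_TABLE_1:.6f}")
print(f"Table 2: chi-square = {CHI2_TABLE_2:.3f}, p = {P_TABLE_2:.3f}")
```

Under this assumption the counts give 18.993 for Table 1 and 9.943 for Table 2, matching the reported statistics, and the Table 2 p-value rounds to the reported 0.002.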
DISCUSSIONS, CONCLUSIONS, AND IMPLICATIONS

The auditor’s report for Canadian companies is currently unqualified in the presence of going concern uncertainties as long as the auditor is satisfied with financial statement disclosure. The only requirement is that the going concern contingency be described in a note integrated with the other notes to the financial statements. The present method has been criticized as being passive and the signal as mute (Boritz, 1991). In an attempt to adopt a more positive stance and accentuate the signal about the going concern contingency, the CICA considered and then withdrew two exposure drafts dealing with going concern uncertainties. The recently rescinded accounting standards proposed a separate stand-alone going concern note combined with reference to the going concern note on the face of the Balance Sheet and Income Statement, but the proposed auditing standard did not require the auditor’s report to be modified. Although the proposed standards have been withdrawn, this issue remains under consideration by the CICA because one of the principal objectives adopted by the AcSB is “to work toward the elimination of significant differences in accounting standards internationally” (CICA, 2000, p. 1). The findings of this study provide evidence that the provision of the explanatory (emphasis of matter) paragraph in the presence of going concern uncertainties influenced the non-professional investor subjects’ decision to invest and their perceptions of the riskiness of the company. These results are consistent with judgment and decision making theory, which holds that multiple reinforcements accentuate the signal, and they contradict the argument that the audit
report is redundant. Although the notes to the financial statements already disclose the same information as in the fourth paragraph of the auditor’s report, it appears that they do not provide the same level of warning to the financial statements’ users. Hence, as part of the CICA’s reassessment of the current going concern reporting and auditing standards in Canada, serious consideration should be given to require auditors to modify the audit report when facing going concern uncertainties. Such reporting, if adopted by the CICA, would also provide closer harmonization of Canadian standards with those of other countries. This study was not without limitations. First, the experimental design did not consider all the costs and benefits associated with real investors’ decision making. Second, the information provided to participants was less than the amount of information usually available to investors. Third, one question asked respondents about risk. Although the research attempted to measure “investment” risk, it is possible that the respondents could have interpreted it as “business” risk. While the extent of this misinterpretation is unknown, it could have impacted the results of the study. Finally, the wording of our cover letter, due to our ethical obligations, may have alerted the readers to the subject matter of this study. The general finding in this study is that the modification of the auditor report with a fourth explanatory paragraph appears, in the Canadian context, to cause investors to focus more on the going concern contingency and therefore sends a stronger signal. This finding is relevant since the issue of the auditor’s role and responsibility in communicating information on uncertainties (including the going concern status of client) is still an open debate in Canada. In a recently published paper, Anandarajan et al. 
(2002) examined loan officers’ reactions to formats based on the current Canadian standard, the proposed exposure draft, and the United States standard. They concluded that bankers did not perceive a difference between the current Canadian standard and the proposed Canadian exposure draft, but they did perceive a difference between the Canadian reporting formats and the format adopted by the United States and other countries. One limitation of that study was the selection of only one sophisticated financial statement user group (bank loan officers). This study extends those results by examining the reaction of a non-professional investor group and finds that the non-professional investors focused more on the going concern contingency under the United States format. In conclusion, this study makes two important contributions. First, it shows that investors are more likely to invest in a company experiencing a going concern problem when it is reported under the Canadian format than under the United States format. These results are even more pronounced than those found in Anandarajan et al. (2002) for loan officers, which suggested that the information may be misleading. From a practical viewpoint, the findings add to the debate on whether Canadian standard setters should change Canadian reporting in the presence of going
concern uncertainties to converge with the format adopted by the United States and other Western countries. Second, from an academic viewpoint, this study contributes to the literature on the incremental information provided by alternate forms of going concern audit reports. Overall, the preponderant view is that the modification or explanatory paragraph in the auditor’s report does not have incremental information content to the financial statement reader (Abdel-Khalik et al., 1986; Elias & Johnston, 2001; Houghton, 1983; LaSalle & Anandarajan, 1997; Libby, 1979; among others). In this study, we find that, in the Canadian context, the format of going concern presentation, especially when information is reinforced by repetition, has incremental information content to a reader. In addition, these results add to and corroborate judgment and decision making theory, which holds that multiple reinforcements accentuate the signal, provided that there is no information overload.
NOTES

1. The two relevant auditing standards in the U.S. are Statement on Auditing Standards (SAS) No. 58, titled Reports on Audited Financial Statements, and SAS No. 59, titled The Auditor’s Consideration of an Entity’s Ability to Continue as a Going Concern (see American Institute of Certified Public Accountants, 1988).

2. The risk that the study is attempting to measure is “investment risk.” It is defined as the investors’ perceptions of the financial viability of an entity based on their reading of the financial statements and notes to the financial statements.

3. A number of procedures were followed to ensure that the students took the task seriously. For example, the experiment was officially conducted as part of the requirements of a class. After the students completed the study, one of the researchers held a discussion of the study with the students. Students were requested to justify their answers with respect to the decision to invest. From the level of participation, we concluded that the students not only understood the material but also took their participation seriously. In fact, the majority of students who declined to invest cited the going concern uncertainty footnote as the primary reason for their decision. These responses were further corroborated by the evidence gathered from respondents’ reaction to question 5 in the response instrument, which requested the participants to allocate points among the information given to them based on each item’s relative importance to their investment decision. Students placed greater weight on items that were generally relevant to the investment decision, with the heaviest weight on the contingency footnote. This suggests that an overall general understanding of the material was present among the participants in the study.
ACKNOWLEDGMENTS The authors thank the editor, the associate editor, and the two anonymous reviewers for their insightful and constructive comments.
REFERENCES
Abdel-Khalik, A. R., Graul, P. R., & Newton, J. D. (1986). Reporting uncertainty and assessment of risk: Replication and extension in a Canadian setting. Journal of Accounting Research, 24, 372–382.
American Institute of Certified Public Accountants (AICPA) (1988). Reports on audited financial statements. In: Statement on Auditing Standards (Nos. 58 and 59). New York, NY: AICPA.
Anandarajan, A., Viger, C., & Curatola, A. P. (2002). An experimental investigation of alternative going-concern reporting formats: A Canadian experience. Canadian Accounting Perspectives, 1(2), 141–162.
Bamber, E. M., & Stratton, R. A. (1997). The information content of the uncertainty-modified audit report: Evidence from bank loan officers. Accounting Horizons, 11(2), 1–11.
Bertholdt, R. H. (1979). Discussion of the impact of uncertainty reporting on the loan decision. Journal of Accounting Research (Suppl.), 58–63.
Boritz, J. E. (1991). The going concern assumption. In: Canadian Institute of Chartered Accountants Research Report. Toronto: CICA.
Canadian Institute of Chartered Accountants (CICA) (1995). Proposed auditing recommendations auditor’s responsibility to evaluate the going concern assumption. Auditing Standards Board (September).
Canadian Institute of Chartered Accountants (CICA) (1996). Proposed accounting recommendations – Going concern. Accounting Standards Board (January).
Canadian Institute of Chartered Accountants (CICA) (1999). Department digest: A summary of current CICA projects and initiatives. The Canadian Accountant (Fall), 2.
Canadian Institute of Chartered Accountants (CICA) (2000). New era begins for the Accounting Standards Board. The Canadian Account (Winter), 1.
Canadian Institute of Chartered Accountants (CICA) (2003). CICA handbook. Toronto: CICA.
Chewning, E., & Harrell, A. (1990). The effect of information overload on decision makers’ cue utilization levels and decision quality in a financial distress decision task. Accounting Organizations & Society, 15(6), 527–542.
Elias, R. Z., & Johnston, J. G. (2001). Is there incremental information content in the going concern explanatory paragraph? Advances in Accounting, 18, 105–117.
Frownfelter, C. A., & Fulkerson, C. L. (1998). Linking the incidence and quality of graphics in annual reports to corporate performance: An international comparison. Advances in Accounting Information Systems, 6, 129–152.
Gul, F. A. (1987). The effects of uncertainty reporting on lending officers’ perception of risk and additional information required. ABACUS, 23(2), 172–179.
Hirst, E., & Hopkins, P. (1998). Comprehensive income reporting and analysts’ valuation judgments. Journal of Accounting Research, 36, 47–75.
Hopkins, P. (1996). The effect of financial statement classification of hybrid financial instruments on financial analysts’ stock price judgments. Journal of Accounting Research (Suppl.), 33–50.
Houghton, K. A. (1983). Audit reports: Their impact on the loan decision process and outcome: An experiment. Accounting and Business Research, 66, 15–20.
Hunton, J. E., & McEwen, R. A. (1997). An assessment of the relation between analysts’ earnings forecast accuracy, motivational incentives, and cognitive information search strategy. The Accounting Review, 72(October), 497–516.
International Federation of Accountants (IFAC) (1999). International statement on auditing 570. Going concern. In: IFAC Handbook Technical Pronouncements. New York: IFAC.
Iselin, E. (1993). The effects of the information and data properties of financial ratios and statements on managerial decision quality. Journal of Business Finance and Accounting, 20(2), 249–266.
LaSalle, R. E., & Anandarajan, A. (1997). Bank loan officers’ reactions to audit reports issued to entities with litigation and going concern uncertainties. Accounting Horizons, 11(2), 33–40.
Levitt, A. (1997, September 26). The importance of high quality accounting standards. Speech to the Inter-American Development Bank, Washington, DC.
Libby, R. (1979). The impact of uncertainty reporting on the loan decision. Journal of Accounting Research (Suppl.), 35–57.
Lipe, M., & Salterio, S. (2000). Balanced scorecard: Judgmental effects of common and unique performance measures. The Accounting Review (July), 283–298.
Maines, L. A., & McDaniel, L. (2000). Effects of comprehensive-income characteristics on nonprofessional investors’ judgments: The role of financial-statement presentation format. The Accounting Review, 75(April), 179–207.
Payne, J., Bettman, J., & Johnson, E. (1993). The adaptive decision maker. Cambridge: Cambridge University Press.
Pringle, L. M., Crum, R. P., & Swetz, R. J. (1990). Do SAS No 59 format changes affect the outcome and the quality of investment decisions. Accounting Horizons (September), 68–76.
Ricchiute, D. (1992). Working paper order effects and auditors’ going concern decisions. The Accounting Review (January), 46–58.
Slovic, P., & MacPhillamy, D. (1974). Dimensional commensurability and cue utilization in comparative judgment. Organizational Behavior and Human Performance, 11, 172–194.
Stocks, M. H., & Tuttle, B. (1998). An examination of information presentation effects on financial distress predictions. Advances in Accounting Information Systems, 18, 107–128.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232.
Walters-York, L. M., & Curatola, A. P. (1998). Recent evidence on the use of students as surrogate subjects. Advances in Accounting Behavioral Research, 1, 123–143.
Walters-York, L. M., & Curatola, A. P. (2000). Theoretical reflections on the use of students as surrogate subjects in behavioral experimentation. Advances in Accounting Behavioral Research, 3, 243–264.
APPENDIX
1. How many university courses have you had in management and financial accounting? How many university courses have you had in auditing? What is your age?
2. Would you be willing to invest in this company? (Check “Yes” or “No”).
YES    NO
3. Please circle on the scale shown below your perception of risk of the company.
LOW RISK  1  2  3  4  5  HIGH RISK
4. Please circle on the scale shown below your perception of the likelihood that the company can improve its profitability.
VERY UNLIKELY  1  2  3  4  5  VERY LIKELY
5. Assign 100 points across the following financial statements items according to the items’ relative importance to your investment decision:
(1) Balance sheet
(2) Footnote 1A (Going Concern Assumption)
(3) Footnote 1B (Summary of Significant Accounting Policies)
(4) Footnote 2 (Restructuring Charge)
(5) Footnote 3 (Income Taxes)
(6) Footnote 4 (Accounts receivable)
(7) Footnote 5 (Inventories)
(8) Footnote 6 (Capital assets)
(9) Footnote 7 (Accounts payable)
(10) Footnote 8 (Long term debt)
(11) Footnote 9 (Contingencies)
(12) Audit report
(13) Statement of earnings and retained earnings
(14) Statement of changes in financial position
6. Please circle on the scale shown below the extent to which you are likely to search for additional information about matters addressed in the auditor’s report.
VERY UNLIKELY  1  2  3  4  5  VERY LIKELY
END OF QUESTIONNAIRE. THANK YOU FOR YOUR HELP
MANAGEMENT FRAUD RISK FACTORS: AN EXAMINATION OF THE SELF-INSIGHT OF AND CONSENSUS AMONG FORENSIC EXPERTS
Sally A. Webber, Barbara Apostolou and John M. Hassell
ABSTRACT
Over the past two years, fraudulent financial reporting has become a major concern of both the Securities and Exchange Commission and investors. These concerns have been spurred by evidence that several high-profile companies such as Enron, Tyco, WorldCom, and HealthSouth have published false and/or misleading financial reports. Statement on Auditing Standards (SAS) No. 82 specifies that auditors have a responsibility to assess the likelihood of management fraud and identifies specific risk factors that should be considered when making that assessment. Apostolou et al. (2001b) examined how internal and external auditors rate the relative importance of these factors. This study extends Apostolou et al. (2001b) by examining how forensic experts at four Big 5 professional service firms assess the factors specified in SAS No. 82. These assessments produced two different models of relative importance: (a) a statistical model (produced by the Analytic Hierarchy Process); and (b) a subjective model (based on subjects’ assessment of the relative weights). These models are then used to assess the self-insight of and the degree of agreement among the forensic experts. The results indicate that forensic experts have a moderately high degree of self-insight. A moderate to high degree of consensus among experts’ judgments about the relative importance of fraud risk factors was noted.
Advances in Accounting Behavioral Research, Volume 7, 75–96 Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07004-8
INTRODUCTION
Statement on Auditing Standards (SAS) No. 82, “Consideration of Fraud in a Financial Statement Audit” (AICPA, 1997), identifies 25 risk factors that an auditor should consider when making a management fraud risk assessment.1 SAS No. 82 provides specific guidance to auditors regarding how they should fulfill the responsibility to obtain reasonable assurance about whether the financial statements are free of material misstatement, whether due to error or fraud. Auditors are required to consider SAS No. 82 risk factors that “are discriminating and have been found to be present frequently in actual instances of fraud” (Mancino, 1997, p. 32). Based upon identification of the presence of risk factors, the auditor is required to assess the risk of material misstatement due to fraud and to document the nature of the response. The fraud risk assessment should evolve throughout the course of the audit, and is not expected to be assessed at a level (i.e. high, medium, low), as is the case with inherent or control risk.
Since late 2001, scrutiny has been placed on the auditing profession as a result of highly publicized corporate frauds (e.g. Enron, WorldCom, HealthSouth) and the corresponding apparent failure of the auditors to detect and report the wrongdoing in a timely manner. Significant shareholder losses and the decline in investor confidence led Congress to enact the Sarbanes-Oxley Act in 2002, which created the Public Company Accounting Oversight Board to increase accountability of the public accounting profession. Meanwhile, the growth of the specialized practice of forensic accounting has been tremendous as a result of the increased concern about corporate fraud and its impact on the integrity of the financial markets. Forensic experts are individuals in professional services firms with specific training in and experience with fraud investigative techniques.
These individuals work with fraud risk and its effects on an ongoing basis, while internal and external auditors may encounter it rarely. Thus, it seems appropriate to consider how forensic experts model the SAS No. 82 management fraud risk factors (AICPA, 1997). Apostolou et al. (2001b) studied how three groups of auditors (50 regional/local firm external auditors, 43 Big 5 firm external auditors, and 47 internal auditors) rated the relative importance of 25 management fraud risk factors in SAS No. 82 (AICPA, 1997). The AHP was used to model the judgment of each subject, and the individual models were then combined to produce mean decision models for
each auditor group. Statistical analysis showed no significant differences between the auditor groups. This study replicates and extends Apostolou et al. (2001b) and investigates the degree of agreement (self-insight and consensus) among 35 practicing Big 5 firm forensic experts regarding the relative importance of the 25 management fraud risk factors. The forensic experts who participated serve crucial roles in their firms because they are summoned whenever fraud is suspected or discovered. Management fraud (as opposed to employee fraud) is the topic of interest in the current study primarily because its effects tend to be more severe, as evidenced by the impact of the highly publicized failures in the Enron era. If the knowledge held by forensic experts is to be passed on to and aid in the training of less experienced auditors, these experts must understand how they personally make decisions regarding risk (i.e. demonstrate a relatively high level of self-insight into their decisions about relative risk (Colbert, 1988)). The results of prior research regarding the self-insight of experts, which has been assessed in several different ways, have been mixed with results documenting low to high self-insight (Ashton & Brown, 1980; Ashton & Kramer, 1980; DeZoort, 1998; Hamilton & Wright, 1982; Mear & Firth, 1987; Slovic et al., 1972). Reilly and Doherty (1992) note that seemingly high correlations reported in some prior studies should be viewed with caution because it is easy to obtain high correlations in studies with few attributes. Many prior accounting studies that have shown auditors have relatively high self-insight have used six or fewer cues (Ashton & Kramer, 1980; Colbert, 1988; Hamilton & Wright, 1982). This study includes a very large number of attributes and an expert population where self-insight has not been previously documented. Thus we are unable to predict a level of self-insight from prior studies. 
If this research is to assist the profession in understanding the relative importance of these 25 fraud risk factors, it is important that the experts surveyed have the ability to describe how they view the importance of the factors (i.e. exhibit self-insight). Since we are unable to assess the accuracy of the subjects’ models in predicting fraud, consensus is measured as a surrogate. Although consensus does not guarantee accuracy, the relative weights produced from this study would be of little assistance if our experts showed little agreement. Two different measures of relative importance of the 25 risk factors were obtained from the forensic experts: (1) statistical weights derived using a mathematical decision model based on paired comparison data provided by the subjects; and (2) subjective weights assigned by the experts. The experts’ degree of agreement was assessed in two separate ways: (1) computing measures of self-insight across models; and (2) computing measures of consensus across experts. Self-insight was computed in two ways: (1) computing the correlations between the statistical and subjective model weights; and (2) after using simulation analysis
to compute predictions from each expert’s statistical and subjective models, computing correlations between the predictions of the two models for each individual. Finally, the degree of consensus among the experts for both statistical and subjective models was assessed in two ways: (1) computing correlations between the model weights; and (2) after using simulation analysis to compute predictions from each expert’s statistical and subjective models, computing correlations between the predictions of the two models for all possible pairs of individuals. The results demonstrate moderately high self-insight and moderate to high consensus. Agreement (self-insight) between an individual expert’s models was moderately high for decision weights and high for predictions, and agreement (consensus) across individuals was higher than that found in prior studies. The remainder of the paper is organized as follows. Prior research is briefly summarized in the next section. The research method is then described, followed by a summary of the results and conclusions.
PRIOR RESEARCH Nieschweitz et al. (2000) review literature that addresses the fraud detection responsibility of external auditors. This prior research emphasizes various approaches to the study of fraud and fraud risk. Some researchers examined the efficacy of decision aids (Eining et al., 1997; Green & Choi, 1997; Pincus, 1989). Others examined how auditors make the fraud risk assessment or can improve upon it (Bernardi, 1994; Hooks et al., 1994; Reckers & Schultz, 1993; Zimbelman, 1997). Another avenue of research is the use of surveys to determine relevant risk factors (Apostolou & Hassell, 1993; Apostolou et al., 2001a; Hackenbrack, 1993; Loebbecke et al., 1989). Finally, the analysis of archival data sources to discover factors identified with management fraud also has been explored (Beasley et al., 1999; Palmrose, 1987). Landsittel and Bedard (1997, p. 3) noted that research can contribute to the ongoing improvement of auditing standards to guide the fraud risk assessment process, including: (1) identifying the most important fraud risk factors; and (2) weighting the risk factors as to their relative importance to the fraud risk assessment. Eining et al. (1997, p. 4) observed that checklists, which are used by auditors to assist in making a fraud risk assessment, give “no mechanical assistance for weighting and combining the red flag cues into an overall assessment.” Heiman-Hoffman et al. (1996) surveyed Big 6 auditors to obtain an explicit ranking of 30 fraud risk factors, but the ranking method used precluded determination of the relative importance of the risk factors. Apostolou et al. (2001a) measured the relative importance of SAS No. 82 risk factors to three groups of experts (Big
5 field auditors, regional/local field auditors, and internal auditors). The current study extends these studies by assessing the relative importance of SAS No. 82 risk factors to the most senior forensic auditors in Big 5 firms.2
RESEARCH METHOD
Data Collection
The Analytic Hierarchy Process
Developed by Thomas Saaty, the Analytic Hierarchy Process (AHP) is a mathematical tool used to model complex judgments (Saaty, 1986, 1988, 1994). Apostolou and Hassell (1993) review AHP research in accounting, and Apostolou et al. (2001b) provide an extended illustration showing the application of AHP. The AHP is useful when qualitative criteria (e.g. management fraud risk factors) enter into judgments, as is the case of assessing the risk of material misstatement due to management fraud. The technique requires that the judgment criteria be organized in a hierarchical fashion, from most general at the top of the hierarchy to most specific at the bottom. An AHP model is constructed by having each participant consider all possible pairs of risk factors within each hierarchical grouping and then rate the factors within each pair as to their importance to the immediately superior level in the hierarchy using a nine-point intensity rating scale.
SAS No. 82 identifies management fraud risk factors that should enter into the assessment of the risk of material misstatement due to management fraud. The same AHP hierarchy for the management fraud risk factors (see Fig. 1) that was developed and described in Apostolou et al. (2001b, p. 8) was used. The hierarchy consists of three levels, with level one the most general and level three the most specific. Level one represents the goal of the decision process: assess risk of material misstatement due to management fraud in a financial statement audit. Level two consists of the SAS No. 82 categories of fraud risk factors: (a) management characteristics and influence over the control environment; (b) industry conditions; and (c) operating and financial stability characteristics. Level three includes a total of 25 specific risk factors (AICPA, 1997, ¶17) that fall within the three level-two categories.
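The way pairwise ratings turn into relative weights can be illustrated with a minimal sketch. It assumes the principal-eigenvector method commonly associated with Saaty's AHP, approximated here by power iteration; whether the software used in the study computes weights in exactly this way is not stated in the text. The function name `ahp_weights` and the three-by-three judgment matrix are hypothetical and are not data from the study.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector of a reciprocal
    pairwise-comparison matrix by power iteration (assumed method)."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]  # normalize to sum to 1
    return weights


# Hypothetical ratings for the three level-two categories on the
# nine-point scale: entry [i][j] says how much more important
# category i is than category j; entry [j][i] is its reciprocal.
comparisons = [
    [1.0, 2.0, 4.0],   # management characteristics
    [1 / 2, 1.0, 2.0],  # industry conditions
    [1 / 4, 1 / 2, 1.0],  # operating and financial stability
]

category_weights = ahp_weights(comparisons)
print([round(w, 3) for w in category_weights])  # [0.571, 0.286, 0.143]
```

Because these illustrative judgments are perfectly consistent, the weights are exactly 4/7, 2/7 and 1/7. Real respondents are rarely perfectly consistent, which is why the method relies on redundancy among the comparisons.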
Fig. 1. Management Fraud Risk Assessment Hierarchy. Note: (a) The risk factors are condensed versions of the definitions used in SAS No. 82 (AICPA, 1997). (b) The hierarchy is consistent with Apostolou et al. (2001b, p. 8).
Model Weights of Relative Importance
Two methods to measure the relative importance of the management fraud risk factors to forensic experts were used. First, statistical model weights of relative
importance were obtained using the AHP to assess the relative importance of the management fraud risk factors to forensic experts from Big 5 firms. Second, subjective weights of relative importance were obtained by asking the forensic experts to allocate 100 points (100%) of relative importance across the 25 management fraud risk factors. Allocating 100 points across cues or factors is a commonly employed technique in behavioral research.
The Research Instrument
Data were collected from forensic experts with a three-part survey-type instrument designed to facilitate the calculation of AHP models. Part one asked the subjects to allocate 100 points over the risk factors within each of the three categories and among the three categories. The risk factors and descriptions were reproduced verbatim from SAS No. 82 (AICPA, 1997, ¶17). Part two presented the pairwise comparisons within and among the three categories. Again, the subjects were provided with both the verbatim SAS No. 82 risk factor description and the AHP scale definitions on each page in which comparisons were required. Part three consisted of demographic questions.
In part one, each participant allocated: (1) 100 points over the three risk categories; and (2) 100 points across the individual risk factors within each category. These responses produced subjective model weights of relative importance. Because the 25 factors appear in three different categories, two ways exist to allocate 100 points across the factors. One is to ask the subjects to allocate 100 points across all 25 factors. Another is to allocate 100 points across the factors in each category, and then to use the subjective allocations within each category to prepare an allocation across all 25 factors. The latter method was used because it is consistent with the SAS No. 82 categorical presentation (AICPA, 1997, ¶17).
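The two-stage allocation can be sketched as follows. The text does not spell out the arithmetic, so this assumes the natural reading: a factor's overall weight is its category's share of 100 points multiplied by the factor's share of 100 points within that category. The function name, the factor labels, and the point values are hypothetical, and only a handful of factors are shown rather than all 25.

```python
def global_weights(category_points, factor_points):
    """Combine category-level and within-category point allocations
    into one weight per factor (assumed multiplicative rule)."""
    weights = {}
    for category, points in category_points.items():
        category_share = points / 100.0
        for factor, within_points in factor_points[category].items():
            weights[factor] = category_share * within_points / 100.0
    return weights


# Hypothetical responses from one participant.
category_points = {"management": 50, "industry": 20, "operating": 30}
factor_points = {
    "management": {"M1": 60, "M2": 40},
    "industry": {"I1": 100},
    "operating": {"O1": 25, "O2": 75},
}

weights = global_weights(category_points, factor_points)
for factor, weight in weights.items():
    print(factor, round(weight, 3))
```

Under this rule the overall weights still sum to one, so they can be compared directly with weights from a model that spans all factors at once.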
It is possible, however, that this process would result in a different allocation than if the 100 points had been allocated across all 25 factors at once. In part two of the instrument, participants made 52 paired-comparisons, which were used to produce statistical model weights of relative importance for each risk factor. An AHP model is produced by having each subject make all possible pairwise comparisons of the level 2 categories (3 comparisons) and the level 3 risk factors (126 comparisons), which would require a total of 129 pairwise comparisons. Subject fatigue is a significant concern when the number of comparisons is so large. When hierarchies contain a large number of comparisons, a function in the AHP’s Expert Choice™ (1998) software used for the computations called “link elements” allows the researcher to reduce the number of comparisons while generating sufficient redundancy among comparisons to produce an AHP model. The model used consists of a total of 52 comparisons: level 2 (3 comparisons) and level 3 (49 comparisons). The only grouping in which the “link elements” feature
was used is the operating and financial stability characteristics category, wherein the number of required comparisons was 28 instead of 105. In part three, demographic information was collected. Questions related to professional experience, involvement with fraud risk assessment, and the instrument were asked to assist in understanding the data.
Forensic Expert Participants
This research study was sponsored by the AICPA as part of its research into the effectiveness of SAS No. 82. The AICPA acted as an intermediary to solicit participation by forensic experts from Big 5 professional service firms. AICPA staff and representatives of each Big 5 firm reviewed the research instrument before it was administered. Four of the five firms agreed to participate by distributing the research instrument to their most senior forensic experts. Thirty-five individuals returned usable research instruments. A forensic expert is an individual with training and experience in assessing fraud risk, conducting fraud investigations, and evaluating the impact of actual fraud. By definition, a forensic expert is most likely to be consulted when fraud is alleged or suspected. Thus, an audit team member may notice indicators of fraud during a routine audit and call upon the forensic expert to investigate. An employee of a client may make an allegation of fraud (i.e. act as a whistleblower), to which a forensic expert may respond. In a scenario such as Enron, a forensic expert will be employed to assess the damage from a known fraud. Forensic techniques extend beyond those used in a routine audit, and are not typically included in a usual audit program. Thus, individuals with the designation of forensic expert may be accountants, attorneys, or former FBI agents who have obtained training and experience beyond the routine.
For this study, all participants were designated by their respective firms as experts in the management fraud area, with mean auditing experience of 12 years, and were at least managers in their firms. Most participants had been members of an audit team when fraud was discovered (66%) and had personally been involved in making a fraud risk assessment (71%). Thirteen participants had specific certification in fraud investigation (e.g. Certified Fraud Examiner).
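The comparison counts reported for part two of the instrument can be checked from the sizes of the hierarchy's groupings. The short calculation below is only an arithmetic check: it takes the category sizes (6, 4, and 15 factors) from the rows listed in Table 1 and treats the figure of 28 comparisons for the largest category as given by the text, since the internal rule the software's "link elements" feature uses to arrive at 28 is not described. The variable names are hypothetical.

```python
from math import comb

# Number of level-three risk factors in each level-two category,
# as listed in Table 1.
category_sizes = {"management": 6, "industry": 4, "operating": 15}

# All possible pairs among the three categories (level two).
level_two = comb(len(category_sizes), 2)

# All possible pairs within each category (level three).
level_three_full = {name: comb(size, 2) for name, size in category_sizes.items()}
full_total = level_two + sum(level_three_full.values())

# Reduced design: the text reports 28 comparisons in place of the
# full set for the operating and financial stability category.
level_three_reduced = dict(level_three_full, operating=28)
reduced_total = level_two + sum(level_three_reduced.values())

print(level_three_full)   # pairs per category under the full design
print(full_total, reduced_total)
```

The totals reproduce the 129 comparisons of the full design and the 52 comparisons actually administered.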
Data Analysis
Self-Insight
Two different measures of self-insight were used to assess the degree of agreement between forensic experts’ decision models. In their 1971 review of the judgment and decision-making literature, Slovic and Lichtenstein concluded that subjects have poor self-insight and that they overestimate the importance placed on minor
cues and underestimate reliance on a few major cues. This conclusion was based upon a review of judgment and decision-making literature that primarily compared statistical and subjective model weights. Reilly and Doherty (1992, p. 286) note that although “there have been doubts expressed about the generalization that people have little or no insight regarding their judgment policies,” this conclusion seems to have become generally accepted. Subjective weights most commonly are derived by asking subjects to allocate 100 points (or 100%) across the judgment criteria. The statistical model most commonly used is regression analysis, although other models such as the AHP have been used (Apostolou & Hassell, 1993; Apostolou et al., 2001a, b; Hassell & Arrington, 1989). Reilly and Doherty (1992) indicate that four common ways of computing self-insight include the following: (1) correlation of the statistical model weights directly with the subjective weights; (2) R2 values from predictions based upon statistical model and subjective weights; (3) correlation statistics of predictions using statistical model and subjective weights; and (4) the judgment predictions for holdout samples based upon statistical model and subjective weights.3 Schmitt and Levine (1977) argue that although the statistical weights and subjective weights appear entirely different, the two sets of predicted values resulting from applications of those weights may be highly correlated. Thus, Schmitt and Levine suggest that the correlation between predicted values (method 3 identified by Reilly and Doherty) may reflect more self-insight than is apparent when comparing statistical and subjective weights, and this correlation provides a more reasonable method of assessing self-insight. Based on Schmitt and Levine’s argument, Surber (1985) reevaluated several prior studies by comparing correlations between predicted values rather than between statistical and subjective weights. 
Surber found that the correlation between the predicted values showed higher self-insight than the previous studies that correlated statistical and subjective weights had suggested. Two within-subject analysis methods were used to assess self-insight in this study: (1) Spearman correlations between the AHP and subjective model weights for each expert, which reflects Reilly and Doherty’s suggestion number one; and (2) Spearman correlations for each expert’s AHP-model predictions and subjective-model predictions from a simulation analysis, which reflects Reilly and Doherty’s suggestion number three. Because measures of actual risk are required to calculate R2 values or to serve as a holdout sample, methods 2 and 4 could not be applied.
Degree of Agreement Among Forensic Experts’ Decision Models (Consensus)
Two between-subject analyses were conducted to measure the degree of agreement across experts’ decision models. First, for AHP and subjective models separately, the Spearman correlations between each possible pair of experts’ decision model
weights were computed (i.e. for all possible pairs of 35 experts, the correlations between the 25 decision model weights). The average of all of these Spearman paired correlations is a measure of consensus-in-principle (Einhorn, 1974). Second, simulation analysis was used to calculate predictions from each expert’s AHP and subjective model weights to determine the level of agreement among the predictions that result from the subjects’ models. The best way to test each expert’s model would be to observe actual cases of management fraud and the factors or combinations of factors associated with the fraud, and then use those to compute how well an individual expert’s decision model predicted the fraud. Unfortunately, data from actual fraud risk assessments are not available due to its proprietary nature. As an alternative to considering actual fraud risk assessments, simulation analysis was used to construct hypothetical cases and to determine how similarly the experts’ AHP and subjective decision models predicted outcomes. A total of 1,000 randomly generated hypothetical cases (trials) were constructed and two different models were used in the simulation. First, a 3-values model was used. For each of the 25 factors in each case, a risk level was randomly assigned to that factor: 0.8 = high risk, 0.5 = medium risk, and 0.2 = low risk. These numerical values simulate an assessment of the level of risk for a particular risk factor. The simulated risk values were then input into each subject’s AHP and subjective decision models where they were weighted using the relative weights from each technique. Although this process does not correspond to that actually used by any auditing firm, these values are useful to simulate model predictions. The randomly assigned risk level was multiplied by the appropriate decision weight for each participant’s 25 AHP factor weights and the results summed. The final numerical score theoretically could range from 0.2–0.8. 
This process was repeated using subjective weights. In a second 2-values model, the simulation analysis was replicated by randomly assigning a value of either zero (no risk) or one (high risk) to each of the 25 risk factors over 1,000 trials. Then, for AHP and subjective models separately, Spearman correlations between each possible pair of experts’ predictions were computed. The average of all of these Spearman correlations is a measure of consensus-in-predicted values (Webber & Hassell, 1997).
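The 3-values simulation and the consensus-in-predicted-values measure can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the experts' weights are randomly generated stand-ins (the study's individual weights are not reported here), only five hypothetical experts are used instead of 35, and the helper names `ranks`, `pearson`, `spearman`, and `random_weights` are invented for the example. Spearman's correlation is computed as a Pearson correlation on average ranks.

```python
import random
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def random_weights(rng, n):
    """Hypothetical decision weights that sum to one."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [value / total for value in raw]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
N_FACTORS, N_CASES, N_EXPERTS = 25, 1000, 5

# Each hypothetical case assigns low (0.2), medium (0.5) or high (0.8)
# risk to every factor at random.
cases = [[rng.choice([0.2, 0.5, 0.8]) for _ in range(N_FACTORS)]
         for _ in range(N_CASES)]
experts = [random_weights(rng, N_FACTORS) for _ in range(N_EXPERTS)]

# Prediction = weighted sum of the simulated risk levels.
predictions = [[sum(w * r for w, r in zip(weights, case)) for case in cases]
               for weights in experts]

# Consensus-in-predicted-values: mean Spearman correlation over all pairs.
pairwise = [spearman(predictions[a], predictions[b])
            for a, b in combinations(range(N_EXPERTS), 2)]
consensus = sum(pairwise) / len(pairwise)
print(round(consensus, 3))
```

The same `spearman` helper applied to one expert's AHP-based and subjective predictions would give the within-subject (self-insight) measure, and replacing the three risk levels with `rng.choice([0, 1])` gives the 2-values variant.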
RESULTS
Forensic Experts’ AHP and Subjective Model Weights
Table 1 presents the mean (median) AHP and subjective decision model weights, computed across subjects as an average of the 35 forensic experts’ models. These
Management Fraud Risk Factors
85
Table 1. Aggregate AHP and Subjective Decision Model Weights. Risk Factor Category
AHP Mean (Median)
Subjective Mean (Median)
Management Characteristics and Influence Over the Control Environment Significant compensation tied to aggressive accounting 0.180(0.160) practices Management’s failure to display appropriate attitude about 0.165(0.140) internal control Nonfinancial management’s influence over GAAP 0.068(0.048) principles or estimates High turnover of senior management 0.042(0.031) Strained management/auditor relationship 0.050(0.043) Known history of securities law violations 0.080(0.039)
0.049(0.050) 0.065(0.060) 0.060(0.050)
Subtotala
0.585
0.495
0.027(0.020)
0.045(0.040)
0.046(0.034)
0.061(0.060)
0.034(0.020) 0.038(0.028)
0.050(0.050) 0.051(0.050)
0.145
0.207
0.018(0.011) 0.023(0.014) 0.019(0.012) 0.029(0.018) 0.028(0.021) 0.006(0.005) 0.010(0.006) 0.016(0.010) 0.024(0.014)
0.024(0.025) 0.028(0.026) 0.020(0.017) 0.028(0.024) 0.026(0.024) 0.011(0.009) 0.014(0.015) 0.020(0.018) 0.025(0.020)
0.020(0.008) 0.011(0.007) 0.012(0.007) 0.024(0.013) 0.016(0.011) 0.014(0.011)
0.018(0.015) 0.014(0.013) 0.014(0.015) 0.018(0.017) 0.019(0.017) 0.019(0.017)
Industry Conditions Effect of new accounting requirements on financial stability/profitability High degree of competition/market saturation and declining margins Company in declining industry Rapid changes in industry, vulnerability to changing technology & product obsolescence Subtotala Operating and Financial Stability Characteristics Significant accounts based on estimates Significant related-party transactions “Substance over form” questions Presence of aggressive incentive programs Potential adverse consequences of poor financial results High vulnerability to interest rates Unusually high dependence on debt Threat of imminent bankruptcy Poor/deteriorating financial position with management guarantee of firm’s debt Bank accounts or operations in tax-haven jurisdictions Overly complex organization Difficulty in determining organizational control Negative operating cash flow but reported earnings Significant pressure to obtain capital Unusually rapid growth/profitability relative to industry
0.120(0.100) 0.130(0.100) 0.071(0.064)
Subtotala
0.270
0.298
Total
1.000
1.000
86
SALLY A. WEBBER ET AL.
weights could be interpreted as two alternative presentations of the relative importance of each management fraud risk factor. Generally, the results reported throughout do not differ based upon the participant’s firm (i.e. no firm effect), although the power of statistical tests is weak due to small sample sizes. The weights shown in Table 1 are consistent with prior research which has shown that the distribution of subjective weights tends to be more “even” (i.e. flatter) than those of statistical weights (Mear & Firth, 1987; Reilly & Dougherty, 1992). Although the relative ranking of the three major risk factor categories is consistent between the two methods, the subjective weights for the categories are more “even.” The highest category weighting (management characteristics and influence over the control environment) is lower and the lowest category weighting (industry conditions) is higher for the subjective method than for the AHP method. Because of the more even distribution of rankings throughout the subjective method, all of the weights in the industry conditions category are higher for the subjective method than for the AHP method. For both the management characteristics and influence over the control environment and operating and financial stability categories, the range of weights for the AHP method is larger than for the subjective method.4
Self-Insight

Self-insight was analyzed using two different measures, both of which are within-subject comparisons. The first is a correlation between the individual expert’s AHP and subjective model weights, and the second is a correlation between the prediction of the AHP model and the prediction of the subjective model.

Correlations Between Individual Expert’s Model Weights (Within Subjects)

Table 2 reports a Spearman correlation for the AHP and subjective model weights for each of the 35 experts: 33 of 35 correlations are statistically significant at the p ≤ 0.05 level.⁵ Further, the mean (median, minimum, maximum) Spearman correlation across the 35 participants is 0.694 (0.688, 0.266, 0.913), which is significant at the p ≤ 0.001 level. These results reflect a fairly high level of self-insight for the forensic experts.

Table 2. Degree of Agreement in Model Weights Across Models (Within-Subjects): Mean Spearman Correlation of Experts’ AHP and Subjective Weights.

Subject    Spearman
1          0.580
2          0.870
3          0.864
4          0.788
5          0.882
6          0.836
7          0.885
8          0.853
9          0.661
10         0.619
11         0.740
12         0.688
13         0.661
14         0.740
15         0.363
16         0.623
17         0.655
18         0.799
19         0.589
20         0.843
21         0.650
22         0.719
23         0.625
24         0.624
25         0.769
26         0.502
27         0.764
28         0.616
29         0.913
30         0.266
31         0.551
32         0.612
33         0.791
34         0.894
35         0.457

Mean       0.694
Median     0.688
Minimum    0.266
Maximum    0.913

Note: Table 2 reports the Spearman correlation between the 25 AHP model weights and 25 subjective model weights for each participant (n = 35). Pearson correlations are essentially identical. All correlations are statistically significant (p ≤ 0.05) except for participants 15 and 30.

Correlations Between Predictions from AHP and Subjective Models (Within Subjects)

The second self-insight analysis, as described previously, is based on a suggestion by Schmitt and Levine (1977). The Spearman correlation between the AHP predicted values for each simulated case and the subjective predicted values for the simulated case was computed as the second means of evaluating self-insight. This result describes consensus-in-predicted values because the correlation measures the degree to which the two models predict the same outcome. Table 3 reports the consensus-in-predicted values (Spearman correlation) for each participant under both the 3-values and 2-values models. For both models and for each expert, the individual correlations across the 1,000 trials are statistically significant (p ≤ 0.001). The mean (median, minimum, maximum) Spearman correlation for the 3-values model is 0.837 (0.842, 0.629, 0.963) and is statistically significant (p ≤ 0.001). In the 2-values model, the mean (median, minimum, maximum) Spearman correlation is 0.831 (0.841, 0.616, 0.963), which also is statistically significant (p ≤ 0.001). The high correlations reflect high degrees of consensus-in-predicted values. The similarity of the mean, median, minimum, and maximum Spearman correlations in both sets of simulations indicates that the results are not sensitive to the two different ways that the model was operationalized.⁶

Table 3. Degree of Agreement in Predictions Across Models (Within-Subjects): Mean Spearman Correlation of Predictions from Forensic Experts’ AHP and Subjective Model Weights.

           Spearman Correlationᵃ
Subject    Model 1 (3-Values)ᵇ   Model 2 (2-Values)ᶜ
1          0.710                 0.721
2          0.890                 0.890
3          0.917                 0.899
4          0.812                 0.781
5          0.914                 0.912
6          0.825                 0.833
7          0.911                 0.900
8          0.858                 0.855
9          0.924                 0.925
10         0.799                 0.774
11         0.896                 0.879
12         0.912                 0.905
13         0.629                 0.616
14         0.783                 0.796
15         0.829                 0.820
16         0.764                 0.747
17         0.903                 0.894
18         0.963                 0.958
19         0.903                 0.891
20         0.955                 0.963
21         0.802                 0.789
22         0.821                 0.806
23         0.764                 0.774
24         0.677                 0.691
25         0.867                 0.868
26         0.753                 0.747
27         0.935                 0.940
28         0.823                 0.806
29         0.901                 0.909
30         0.678                 0.665
31         0.842                 0.836
32         0.898                 0.892
33         0.838                 0.841
34         0.911                 0.912
35         0.681                 0.650

Mean       0.837                 0.831
Median     0.842                 0.841
Minimum    0.629                 0.616
Maximum    0.963                 0.963

Note: Each individual correlation is significant (p ≤ 0.001).
ᵃ The mean Spearman correlation between the predicted outcomes of 1,000 trials, with predictions based upon 25 AHP model weights and 25 subjective model weights. Pearson correlations are essentially identical.
ᵇ 3-Values model. The 1,000 random trials assign one of three values to each of the 25 weights: 0.8, 0.5, and 0.2, to represent high, medium, and low risk.
ᶜ 2-Values model. The 1,000 random trials assign one of two values to each of the 25 weights: 1.0 and 0.0, to represent presence or absence of perceived risk.
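As a quick arithmetic check on the within-subject weight correlations, the summary statistics reported for Table 2 can be recomputed from the 35 individual values. The significance screen at the end is an assumption, not the authors' stated method: it uses the common t approximation for a rank correlation, t = r·sqrt((n − 2)/(1 − r²)) with n = 25 weights, and a two-tailed 5% critical value of about 2.069 for 23 degrees of freedom.

```python
from math import sqrt
from statistics import mean, median

N_WEIGHTS = 25   # each correlation is computed over 25 risk-factor weights
T_CRIT = 2.069   # assumed: two-tailed 5% critical value of Student's t, 23 df

# Spearman correlations for subjects 1..35, as listed in Table 2.
table2 = [
    0.580, 0.870, 0.864, 0.788, 0.882, 0.836, 0.885, 0.853, 0.661, 0.619,
    0.740, 0.688, 0.661, 0.740, 0.363, 0.623, 0.655, 0.799, 0.589, 0.843,
    0.650, 0.719, 0.625, 0.624, 0.769, 0.502, 0.764, 0.616, 0.913, 0.266,
    0.551, 0.612, 0.791, 0.894, 0.457,
]


def t_statistic(r, n=N_WEIGHTS):
    """t approximation for testing a correlation coefficient against zero."""
    return r * sqrt((n - 2) / (1 - r * r))


summary = {
    "mean": round(mean(table2), 3),
    "median": median(table2),
    "minimum": min(table2),
    "maximum": max(table2),
}
not_significant = [
    subject for subject, r in enumerate(table2, start=1)
    if t_statistic(r) < T_CRIT
]
print(summary)
print("not significant at 5% (two-tailed, assumed):", not_significant)
```

Under these assumptions the screen flags the same two participants (15 and 30) that the note to Table 2 identifies, which suggests a two-tailed test, although the source does not say so.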
Consensus Among Experts

Consensus of Model Weights Across Experts (Between Subjects)

Individual correlations for all possible pairs of the 35 participants’ AHP weights were computed, resulting in 595 correlations; then, the process was repeated for the subjective weights. As shown in Table 4, the mean Spearman correlation for the 595 correlations is 0.459 for the AHP weights and 0.634 for the subjective weights, which are both statistically significant at the p = 0.001 level. These mean correlations reflect a moderate to moderately high degree of consensus-in-principle among the forensic experts’ decision models. Further, note that for 595 total subjective model correlations, 96% (569) are statistically significant, which indicates a high degree of agreement. For AHP correlations, 68% of individual correlations are statistically significant, which reflects a moderately high degree of agreement. Prior AHP studies typically have reported low levels of consensus-in-principle (Apostolou & Hassell, 1993; Webber & Hassell, 1997).

Table 4. Degree of Agreement in Model Weights Across Experts (Between Subjects).

Decision Model   Mean Spearman    Number of       Number of Significant   Number of Positive
                 Correlationᵃ     Correlationsᵇ   Correlationsᶜ           Correlations
AHP              0.459            595             405 (68%)               567 (95%)
Subjective       0.634            595             569 (96%)               595 (100%)

ᵃ A correlation measure for the set of 25 decision weights is calculated for each possible pair of participants. The mean Spearman correlation is computed as the average of the correlation measures for the 595 pairs of participants. It indicates the degree of agreement among the set of raters. Both Spearman correlations are significant at the p = 0.001 level. The results are essentially identical if Kendall’s W is used as an alternative interrater reliability measure.
ᵇ The number of possible pairings of the 35 participants is calculated as n!/[k!(n − k)!], with n = 35 and k = 2.
ᶜ The number of significant individual correlations (p ≤ 0.05).

Consensus in Predictions Across Experts (Between Subjects)

Spearman correlations were computed for all possible 595 pairs of the 35 participants’ AHP model predictions, and the process was repeated using predictions based upon the subjective models. Table 5 reports the mean Spearman correlation of the AHP predictions for both the 3-values (0.637) and 2-values (0.662) models. Corresponding Spearman correlations based upon subjective weights are 0.779 and 0.790, respectively. Further, Table 5 reports that all 595 individual correlations are positive and statistically significant (p ≤ 0.05) for both the 3-values and 2-values models. These results reflect moderately high to high consensus across subjects. Note that the power of the tests reported in Table 5 is greater than that of the tests reported in Table 4 because the results in Table 5 reflect simulated responses over 1,000 trials.
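The pair count and percentages quoted for the between-subjects weight correlations follow from simple arithmetic, sketched here in Python. The variable names are illustrative; the counts are those reported for the AHP and subjective weight correlations.

```python
from math import comb, factorial

n, k = 35, 2  # 35 participants, taken two at a time

# Number of distinct pairs, n! / [k! (n - k)!]
pairs = factorial(n) // (factorial(k) * factorial(n - k))
assert pairs == comb(n, k)

# Counts reported for the model-weight correlations.
counts = {
    "AHP significant": 405,
    "AHP positive": 567,
    "Subjective significant": 569,
    "Subjective positive": 595,
}
shares = {label: round(100 * count / pairs) for label, count in counts.items()}

print(pairs)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The computed shares (68%, 95%, 96%, and 100%) match the rounded percentages reported alongside the counts.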
Table 5. Degree of Agreement in Predictions Across Experts (Between Subjects).

Decision Model   Mean Spearman    Number of       Number of Significant   Number of Positive
                 Correlationᵃ     Correlationsᵇ   Correlationsᶜ           Individual Correlations
AHP
  3-Values       0.637            595             595 (100%)              595 (100%)
  2-Values       0.662            595             595 (100%)              595 (100%)
Subjective
  3-Values       0.779            595             595 (100%)              595 (100%)
  2-Values       0.790            595             595 (100%)              595 (100%)

ᵃ The mean Spearman correlation across predictions based upon experts’ model weights. The results are essentially identical if Pearson correlations are computed.
ᵇ The number of correlations for every pair of 35 participants’ 25 model weights.
ᶜ The number of statistically significant individual correlations at p ≤ 0.05 level.

Comparison of Within-Subjects and Between-Subjects Correlations

In a final attempt to assess self-insight, the (within-subjects) mean self-insight Spearman correlations of AHP and subjective model weights (Table 2) were compared to the (between-subjects) mean consensus Spearman correlations for the AHP weights and the subjective weights (Table 4). If subjects were responding based on an internal decision model, the average within-subjects correlation would be expected to be higher than the average between-subjects correlation because two different models from the same individual who possesses self-insight should be more highly related than those from different individuals.

Table 6 presents the results of a t-test (Wilcoxon rank sum test) of the null hypothesis of no difference in mean (median) Spearman correlations. The self-insight correlation is statistically significantly greater than both the consensus subjective-weight correlation (t-test p-value = 0.0075; Wilcoxon p-value = 0.0061) and the consensus AHP-weight correlation (both t-test and Wilcoxon p-values = 0.0001). These results reflect a higher degree of agreement between two different models for the same subject than the degree of agreement across subjects for the same model. This finding also supports the assessment that the experts have a high degree of self-insight. Table 7 repeats the analysis using consensus-in-predicted values, and reflects the same results as in Table 6. The Table 6 and Table 7 results are interpreted as reinforcing the conclusion that forensic experts have high self-insight. The degree of agreement between the experts’ AHP and subjective models was higher than the degree of agreement across experts for either the AHP or subjective models.

Table 6. Comparison of Within-Subjects and Between-Subjects Correlations.

Model Weights                                  Mean Spearman   t-Test       Wilcoxon Rank
                                               Correlation                  Sum Test
Within-subjects (self-insight correlations)
  AHP and subjective weightsᵃ                  0.694
Between-subjects (consensus-in-principle)
  Subjective weightsᵇ                          0.634           p = 0.0075   p = 0.0061
  AHP weights                                  0.459           p = 0.0001   p = 0.0001

ᵃ Mean of 35 correlations, see Table 2.
ᵇ Mean of 595 correlations, see Table 4.

Table 7. Comparison of Within-Subjects and Between-Subjects Correlations.

Model Predictions                              Mean Spearman   t-Test       Wilcoxon Rank
(Consensus-in-Predicted Values)                Correlation                  Sum Test
3-Values Model
  Within-Subjects
    AHP and subjective model predictionsᵃ      0.837
  Between-Subjects
    AHP model predictionsᵇ                     0.637           p = 0.0001   p = 0.0001
    Subjective model predictionsᵇ              0.779           p = 0.002    p = 0.003
2-Values Model
  Within-Subjects
    AHP and subjective model predictionsᵃ      0.831
  Between-Subjects
    AHP model predictionsᵇ                     0.662           p = 0.0001   p = 0.0001
    Subjective model predictionsᵇ              0.790           p = 0.003    p = 0.004

ᵃ Mean of 35 correlations, see Table 3.
ᵇ Mean of 595 correlations, see Table 5.
CONCLUSION

In this study, Apostolou et al. (2001b) was replicated and extended. Apostolou et al. (2001b) provided results showing that three auditor group decision models assessing SAS No. 82 risk factors (internal, external Big 5, and external regional/local) were not statistically significantly different. The replication is that a different subject pool was used with the same AHP modeling approach, although direct comparisons of the AHP models in Apostolou et al. (2001b) were not made. However, the forensic experts and the subjects in Apostolou et al. (2001b) ranked the risk factor categories in the same order, with management characteristics and influence over the control environment most important. The extension is that additional analysis was conducted to examine the subjects’ degree of self-insight related to the AHP modeling and consensus among the subjects.

Descriptive information about forensic experts’ AHP and subjective models related to the relative importance of management fraud risk factors is provided. The extant literature on self-insight and consensus is expanded by demonstrating that forensic experts possess relatively high degrees of both. Because self-insight and consensus measures are used to evaluate judgment quality, it appears that forensic experts provide useful data for research into the quality of the management fraud risk assessment. The task in this study differs from that used in many other expert judgment assessments because factors that are explicitly defined by professional standards are used, which may be why higher levels of self-insight and consensus are found. How the AHP and subjective model weights regarding the relative importance of SAS No. 82 factors can be obtained is also described. The research technique is relatively easy to apply, and the time commitment of subjects is reasonable.

While it is not appropriate to generalize the results of a single study and make sweeping, global statements, some speculation about what these results could mean for auditing practice is offered. First, because the results support the conclusion that Big 5 forensic experts possess high self-insight, it might be possible for a firm to use the decision weights of its experts for audit training purposes or as the basis for analytical procedures. The decision models of forensic experts could be used for making predictions about the likelihood of management fraud. If a model suggests a high likelihood of fraud, the forensic experts could be brought in as consultants to manage risk exposure on a job. Models could be validated and refined by applying them to actual audit situations. Also, these decision models could be used to design audit expert systems and decision support systems. This process could be used in an actual auditing setting as an analytical procedures tool. Over time, a firm could gather evidence about the a priori risk ratings and compare them to actual results, although this task could be difficult because of the low occurrence rate of material management fraud. The prediction scores could then be used more formally in assessments about the likelihood of management fraud.

A limitation of the research method used is that the factors were considered individually. It is possible that factors interact such that certain factor groupings may signal an increased likelihood of management fraud. Other limitations result from certain design choices (e.g. allocating 100 points to factors within each factor category rather than 100 points across all 25 factors, or the way the simulation analysis was operationalized). DeZoort (1998) argues that using the 100-point allocation measure for subjective weights unnecessarily creates measurement error.
Another limitation is that the data cannot show what an auditor will do with information even if he or she determines that fraud risk is high (i.e. will the follow-up be appropriate to the circumstances). These limitations suggest the need for additional research.
NOTES

1. Statement on Auditing Standards No. 99, Consideration of Fraud in a Financial Statement Audit, was issued by the AICPA’s Auditing Standards Board in October 2002. SAS No. 99 emphasizes professional skepticism, tests for management override of controls, and the use of unpredictable audit tests, and requires discussions with management about fraud awareness. A new feature of SAS No. 99 is that the management fraud risk factors are categorized as Incentives/Pressures, Opportunities, and Attitudes/Rationalizations in explicit recognition that these three conditions exist when fraud is present. However, the relative importance of the risk factors is not expressed, which means that the findings of this research continue to be relevant. SAS No. 82 is referenced in this discussion because the research was conducted prior to the issuance of SAS No. 99.

2. Because this is a replication and extension of Apostolou et al. (2001b), the literature review is abbreviated; see that paper for a more extensive literature review.

3. Reilly and Doherty (1989) asked student subjects to provide subjective weights in a judgment task. Self-insight was operationalized as the ability of a subject to choose his or her own decision model from an available set of all subjects’ decision models. Seven of 11 subjects selected their own decision models, and two other subjects were able to narrow their selection to two decision models, one of which was their model. The authors concluded that subjects had a high degree of self-insight.

4. This study is a follow-up to Apostolou et al. (2001b), which investigated the relative importance of the 25 SAS No. 82 risk factors to (1) 43 field auditors at four of the Big 5 professional service firms; (2) 50 field auditors at four regional/local accounting firms; and (3) 47 practicing internal auditors. Apostolou et al. (2001b) report the AHP weights of relative importance for those three groups. In this paper, the AHP weights of the 35 forensic auditors are not compared to the AHP weights of the three auditor groups in Apostolou et al. (2001b) because the focus is on the forensic auditors. The AHP category weights for the forensic auditors and the auditor groups reported in Apostolou et al. (2001b) are not statistically significantly different, although several significant differences are associated with individual risk factors.

5. Pearson correlation statistics were also computed but are not presented. Inferences throughout the paper do not change if Pearson correlations are used instead of Spearman correlations.

6. To test Schmitt and Levine’s (1977) suggestion that correlations between predicted values resulting from using statistical and subjective weights might reflect more self-insight than the comparison of the statistical and subjective weights, a t-test (Wilcoxon signed rank test) comparing the mean (median) of the two distributions of correlation statistics was conducted. The mean (median) of the distribution of predicted value correlations was statistically significantly greater than the mean (median) of the distribution of self-insight correlations based on comparisons of the decision weights (p = 0.0001 for both tests). Mear and Firth (1987) also found that self-insight measures based upon predictions were higher than self-insight measures based upon model weights for financial analyst subjects.
ACKNOWLEDGMENTS We thank the AICPA, which provided financial support and help in securing the participation of forensic experts, the forensic experts who provided data, and colleagues at the 2001 ABO Conference.
REFERENCES American Institute of Certified Public Accountants (AICPA) (1997). Consideration of Fraud in a Financial Statement Audit. Statement on Auditing Standards No. 82. New York, NY: AICPA. American Institute of Certified Public Accountants (AICPA) (2002). Consideration of Fraud in a Financial Statement Audit. Statement on Auditing Standards No. 99. New York, NY: AICPA.
Apostolou, B., & Hassell, J. M. (1993). An overview of the analytic hierarchy process and its use in accounting research. Journal of Accounting Literature, 12, 1–28. Apostolou, B., Hassell, J. M., & Webber, S. A. (2001a). The importance of management fraud risk factors: Ratings by forensic experts. The CPA Journal (October), 2–7. Apostolou, B., Hassell, J. M., Webber, S. A., & Sumners, G. E. (2001b). The relative importance of management fraud risk factors. Behavioral Research in Accounting, 13, 1–24. Ashton, R. H., & Brown, P. R. (1980). Descriptive modeling of auditors’ internal control judgments: Replication and extension. Journal of Accounting Research, 18, 269–277. Ashton, R. H., & Kramer, S. S. (1980). Students as surrogates in behavioral accounting research: Some evidence. Journal of Accounting Research, 18, 1–15. Beasley, M. S., Carcello, J. V., & Hermanson, D. R. (1999). Fraudulent financial reporting: 1987–1997. Committee of sponsoring organizations of the Treadway commission. Bernardi, R. A. (1994). Fraud detection: The effect of client integrity and competence and auditor cognitive style. Auditing: A Journal of Practice & Theory, 13(Supp.), 68–84. Colbert, J. L. (1988). Inherent Risk: An investigation of auditors’ judgments. Accounting, Organizations and Society, 13, 111–121. DeZoort, F. T. (1998). An analysis of experience effects on audit committee members’ oversight judgments. Accounting, Organizations and Society, 23, 1–21. Einhorn, H. J. (1974). Expert judgment: Some necessary conditions and an example. Journal of Applied Psychology, 59, 562–571. Eining, M. M., Jones, D. R., & Loebbecke, J. K. (1997). Reliance on decision aids: An examination of auditors’ assessment of management fraud. Auditing: A Journal of Practice & Theory, 16(Fall), 1–19. Expert Choice, Inc. (1998). Team expert choice™ user manual. Pittsburgh, PA. Green, B. P., & Choi, J. H. (1997). Assessing the risk of management fraud through neural network technology. 
Auditing: A Journal of Practice & Theory, 16(Spring), 14–28. Hackenbrack, K. (1993). The effect of experience with different sized clients on auditor evaluations of fraudulent financial reporting indicators. Auditing: A Journal of Practice & Theory, 15(Spring), 99–110. Hamilton, R. E., & Wright, W. F. (1982). Internal control judgments and effects of experience: Replications and extensions. Journal of Accounting Research, 20, 756–766. Hassell, J. M., & Arrington, C. E. (1989). A comparative analysis of the construct validity of coefficients in paramorphic models of accounting judgments: A replication and extension. Accounting, Organizations and Society, 14, 527–537. Heiman-Hoffman, V. B., Morgan, K. P., & Patton, J. M. (1996). The warning signs of fraudulent financial reporting. Journal of Accountancy (October), 75–77. Hooks, K. L., Kaplan, S. E., & Schultz, J. J., Jr. (1994). Enhancing communication to assist in fraud prevention and detection. Auditing: A Journal of Practice & Theory (Fall), 86–117. Landsittel, D. L., & Bedard, J. C. (1997). Fraud and the auditor: Current developments and ongoing challenges. The Auditor’s Report (Fall), 3–4. Loebbecke, J. K., Eining, M. M., & Willingham, J. J. (1989). Auditors’ experience with material irregularities: Frequency, nature, and detectability. Auditing: A Journal of Practice & Theory, 9(Fall), 1–28. Mancino, J. (1997). The auditor and fraud. Journal of Accountancy (April), 32–36. Mear, R., & Firth, M. (1987). Cue usage and self-insight of financial analysts. The Accounting Review, 62, 176–182.
Nieschweitz, R. J., Schultz, J. J., Jr., & Zimbelman, M. F. (2000). Empirical research on external auditors’ detection of financial statement fraud. Journal of Accounting Literature, 19, 190–246. Palmrose, Z. V. (1987). Litigation and independent auditors: The role of business failures and management fraud. Auditing: A Journal of Practice & Theory (Spring), 90–103. Pincus, K. V. (1989). The efficacy of a red flags questionnaire for assessing the possibility of fraud. Accounting, Organizations and Society, 14(1/2), 153–163. Reckers, P. M. J., & Schultz, J. J., Jr. (1993). The effects of fraud signals, evidence order, and group-assisted counsel on independent auditor judgment. Behavioral Research in Accounting, 5, 124–144. Reilly, B. A., & Doherty, M. E. (1989). A note on the assessment of self-insight in judgment research. Organizational Behavior and Human Decision Process, 44, 123–131. Reilly, B. A., & Doherty, M. E. (1992). The assessment of self-insight in judgment policies. Organizational Behavior and Human Decision Process, 53, 285–309. Saaty, T. L. (1986). Decision making for leaders. Pittsburgh, PA: RWS Publications. Saaty, T. L. (1988). The analytic hierarchy process. Pittsburgh, PA: RWS Publications. Saaty, T. L. (1994). The analytic hierarchy process: Some observations on the paper by Apostolou and Hassell. Journal of Accounting Literature, 13, 212–219. Schmitt, N., & Levine, R. L. (1977). Statistical and subjective weights: Some problems and proposals. Organizational Behavior and Human Performance, 20, 15–30. Slovic, P., Fleissner, D., & Bauman, W. S. (1972). Analyzing the use of information in investment decision making: A methodological proposal. The Journal of Business, 45, 283–301. Slovic, P., & Lichtenstein, S. (1971). Comparison of Bayesian and regression approaches to the study of information processing in judgment. Organizational Behavior and Human Performance, 6, 649–744. Surber, C. F. (1985). 
Measuring the importance of information in judgment: Individual differences in weighting ability and effort. Organizational Behavior and Human Decision Processes, 35, 156–178. Webber, S. A., & Hassell, J. M. (1997). A comparison of AHP and ANOVA decision modeling techniques in internal control procedure evaluation. Advances in Accounting, 15, 209–242. Zimbelman, M. F. (1997). The effects of SAS no. 82 on auditors’ attention to fraud risk factors and audit planning decisions. Journal of Accounting Research, 35(Supp.), 75–104.
BUDGETARY SLACK CREATION AND TASK PERFORMANCE: COMPARING INDIVIDUALS TO COLLECTIVE UNITS

James M. Kohlmeyer III and James E. Hunton

ABSTRACT

The purpose of this study is to investigate differences between individual and collective budgeting decisions with respect to budgetary slack creation and task performance. While a great deal of research exists in the area of budgeting, to our knowledge, no prior studies have dealt with budget settings in a collective (e.g. small group or cross-functional team) environment. Accordingly, the current study examines differences in slack creation and task performance using a two (decision mode: individual vs. collective decision) by two (incentive contract: slack-inducing vs. truth-inducing) between-subjects experimental design. A total of 295 students participated in the experiment (79 individuals and 72 three-person collective units). As expected, individuals and collective decision-makers created significantly more slack under a slack-inducing contract than a truth-inducing contract. Additionally, as anticipated, collective decision-makers created more slack than individuals under a slack-inducing contract. Unexpectedly, however, collective decision-makers created more slack than individuals using a truth-inducing contract. Task performance was significantly different between individuals and collective unit members, such that the performance of the former exceeded that of the latter, as hypothesized. Finally, preliminary analysis indicated that choice shift occurred in the collective units, such that the units became more cautious in setting budget goals than individuals under both incentive contract conditions.

Advances in Accounting Behavioral Research, Volume 7, 97–122. Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved. ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07005-X
INTRODUCTION Budgets are widely used tools in planning, organizing, directing and controlling the operations of business and governmental entities. While budgets reflect quantitative expressions of predicted performance results, they also serve to motivate subordinate performance (Chow et al., 1988). However, allowing employees to participate in setting their own budget targets provides them with an opportunity to ‘game’ the process and negotiate more easily achievable goals. Budget distortion of this nature often is referred to as budgetary slack (Merchant & VanderStede, 2000). Historically, budgetary slack research has focused on individual decisionmaking (Chow, 1983; Dunk, 1993; Young, 1985). Even though there has been an increased emphasis on teamwork in the workplace (Siegel & Sorensen, 1999), researchers have not examined the effects of collective decisions on the budgeting process. In an attempt to bridge this research gap, the purpose of this study is to examine differences between individual and collective decisions with respect to budgetary slack creation and actual task performance. In social psychology, there are conflicting views regarding the effectiveness of collective vs. individual decision-making. While some research has found that collective units perform better than individuals, other studies have found opposite effects (Isenberg, 1986; Janis, 1982; Rutledge & Harrell, 1994). The impact of incentive contracts on collective decision-making units has not been examined in social psychology or accounting, and the efficacy of a truth-inducing contract as a debiasing mechanism for collective units is uncharted territory as well. Toward this end, this study investigates differential reactions and behaviors of individuals and collective units to slack-inducing and truth-inducing incentive contracts. The current study operationalized a two (decision mode: individual vs. collective decision) by two (incentive contract: slack-inducing vs. 
truth-inducing) fully-crossed, between-subjects experimental design. A total of 295 students participated in the study. As anticipated, under a slack-inducing, as compared to truth-inducing, contract, individuals and collective units created more budgetary slack. Also as expected, under a slack-inducing contract collective units created more budgetary slack than individuals. Under a truth-inducing contract, however,
individuals unexpectedly created significantly less budgetary slack than collective units. Additionally, as hypothesized, the mean performance of collective unit members was significantly less than the mean performance of individuals. Finally, post hoc analysis suggests evidence of choice shift in the collective units, which helps to explain observed budgetary slack differences between individuals and collective units.

The current study contributes to extant participative budgeting and slack research along several avenues. First, the differential impact of individual versus collective decision-makers on budgetary slack creation and actual task performance is examined for the first time in a budgeting context. Second, the mitigating impact of truth-inducing contracts on collective budgeting units is investigated. Finally, this study provides preliminary evidence that choice shift in a collective budgeting environment leans toward caution, not risk.

The next section reviews relevant literature and presents study hypotheses. The following sections describe the research method and analyze study results. The final section summarizes the research findings and suggests future research ideas in the area of collective participation in the budgeting process.
THEORY AND HYPOTHESES

Over the last four decades, management accounting researchers have been concerned with the issue of budgetary slack in organizations. Managers and their subordinates create budgetary slack when they purposefully build excess resources into their budgets or knowingly understate their productive capabilities (Baiman & Evans, 1983; Young & Lewis, 1995). Budgetary slack has also been described as the express incorporation of budget amounts that are easier to attain (Lukka, 1988; Merchant, 1985; Young, 1985). In this study, budgetary slack is defined as the extent to which participants understate their true productive capabilities.

Many studies have found evidence of budgetary slack in organizations (e.g. Cammann, 1976; Kamin & Ronen, 1981; Kirby et al., 1991; Leibenstein, 1979; Lukka, 1988; Merchant, 1985; Merchant & Manzoni, 1989; Onsi, 1973; Schiff & Lewin, 1968; Umapathy, 1987). Having recognized that budgetary slack regularly occurs in business and governmental entities, researchers have focused on understanding why slack occurs, what factors affect its creation and how to minimize its effect. Principally, extant studies have investigated the creation of budgetary slack by individuals engaged in participative budgeting contexts. Shields and Shields (1998) define participative budgeting as a process whereby subordinates are involved with and have influence on the determination of their budgets. Agency theory assumes that subordinates know more than their superiors
about their tasks and task environments; thus, agency theorists characterize participative budgeting as a means by which superiors attempt to gain private information from subordinates and, as a consequence, reduce uncertainty (Baiman & Evans, 1983; Kirby et al., 1991; Shields & Shields, 1998). Information sharing via participative budgeting allows superiors to design and offer subordinates efficient goal-congruent incentive contracts aimed at realizing the subordinates’ true productive capabilities.
Incentive Contracts

The study of incentive contracts within the framework of budgetary slack relies primarily on agency theory as its theoretical foundation (Douglas & Wier, 2000). Since agency theory focuses on the relationship between agents and principals, researchers have suggested that a properly designed reward system may induce subordinates to supply more accurate budgets (Chow, 1983). Unfortunately, setting an appropriate budget can be a major problem when subordinates have better information than their superior concerning their true productive capabilities, and when the subordinates’ pay is based on budgeted performance (Chow et al., 1988).

While planning benefits may arise from participative budgeting, an incentive problem is often created. For instance, when a subordinate’s pay increases as budget difficulty decreases, ceteris paribus, the subordinate may bias the communication of private information such that a relatively easy budget is set, thereby creating slack. Without proper incentives to induce truthful communication and motivate best performance, some benefits of participative budgeting to an organization will be lost (Chow et al., 1988). To deal with these problems, researchers have designed and tested two types of incentive contracts: slack-inducing and truth-inducing.

Prior studies demonstrate that when subordinates hold private information about their productive capabilities, participate in setting production targets and receive bonuses based on exceeding the production targets, they have an incentive to underestimate their productive capabilities in order to receive larger bonuses (Kirby et al., 1991). Thus, subordinates will typically build slack into budgets that provide opportunities to earn extra compensation, shirk responsibility on the job or both (Young, 1985). Hence, the type of contract that provides a fixed wage plus a bonus for performance exceeding the production target or budget has been defined as slack-inducing.
Analytical research in accounting and economics has examined another form of incentive contract, called truth-inducing, which helps to alleviate incentive problems regarding subordinate motivation and communication of private
information (Christensen, 1982; Weitzman, 1976). Under certain assumptions, truth-inducing schemes induce subordinates to prefer budgets that are closely aligned with their ‘true’ expected performance. In addition, such schemes provide incentives to maximize performance regardless of the budget (Chow et al., 1988). Truth-inducing contracts have been constructed to solve the problem of misrepresentation of subordinates’ private information because such contracts impose a penalty for such distortion (Libby, 2002).

Several empirical tests have compared the amount of budgetary slack created by subordinates rewarded under slack-inducing or truth-inducing incentive contracts. Chow et al. (1988) observed that when information asymmetry was present, slack was significantly lower under the truth-inducing scheme than under the slack-inducing scheme. Waller (1988) found that when the truth-inducing scheme was introduced, slack created by risk-neutral participants decreased significantly while slack created by risk-averse participants did not change. Waller and Bishop (1990) reported that overall firm profits were lower and misrepresentations higher under a slack-inducing incentive contract when compared to the truth-inducing scheme. Finally, Chow et al. (2000) tested five different mechanisms for encouraging truthful, upward communication of information within decentralized organizations. Results of the Chow et al. (2000) study indicated that the truth-inducing contract led to significantly less misrepresentation by subordinates than did a slack-inducing, linear profit-sharing pay scheme.

In sum, the budgetary slack literature provides empirical evidence that under conditions of information asymmetry, truth-inducing incentive contracts appear to reduce the amount of budgetary slack when compared to slack-inducing contracts.
Accordingly, we do not offer a hypothesis related to how individuals respond to slack-inducing and truth-inducing contracts, as prior findings in this regard are quite robust. However, we include these two conditions in our experimental design because they serve as a baseline against which to compare slack creation in a collective decision-making environment.
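The contrast between the two contract types can be illustrated with a minimal sketch. The slack-inducing function follows the description above (a fixed wage plus a bonus for output above budget). The truth-inducing function uses one common piecewise-linear form associated with Weitzman (1976). That functional form, every coefficient value, and all function names below are illustrative assumptions; they are not the payoff table used in this study.

```python
def slack_inducing_pay(budget, actual, fixed_wage=500.0, bonus_rate=10.0):
    """Fixed wage plus a bonus for each unit of output above budget.

    fixed_wage and bonus_rate are hypothetical values chosen for illustration.
    """
    return fixed_wage + bonus_rate * max(0, actual - budget)


def truth_inducing_pay(budget, actual, a=1.0, b=2.0, c=3.0):
    """Piecewise-linear scheme with 0 < a < b < c (hypothetical coefficients).

    Output above budget earns only `a` per unit (less than `b`), while a
    shortfall costs `c` per unit (more than `b`), so misstating the budget
    in either direction lowers pay.
    """
    if actual >= budget:
        return b * budget + a * (actual - budget)
    return b * budget - c * (budget - actual)


def best_budget(pay_fn, actual, budgets):
    """Return the budget that maximizes pay for a known actual output."""
    return max(budgets, key=lambda budget: pay_fn(budget, actual))


if __name__ == "__main__":
    feasible = range(10, 51)  # hypothetical budgets from 10 to 50 words
    print(best_budget(slack_inducing_pay, 30, feasible))  # lowest allowed budget
    print(best_budget(truth_inducing_pay, 30, feasible))  # budget equal to output
```

Under these assumed parameters, a participant who expects to decode 30 words does best by budgeting the minimum under the slack-inducing function and by budgeting exactly 30 under the truth-inducing function, which is the behavioral contrast the literature reviewed above describes.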
Collective Decision Making

In recent years there has been increased interest in collective decision-making processes (e.g. in small groups and teams). Organizations are making more decisions in the context of collective environments rather than placing such responsibilities on individuals. Accordingly, recent changes in business practices are influencing the traditional managerial accounting environment and necessitating a re-examination of prior research efforts (Young & Lewis, 1995). We suggest that similarities in how individuals and collective units respond to incentive contracts
are rooted in motivational incentives (Geen, 1991). That is, a slack-inducing contract encourages the creation of budgetary slack and a truth-inducing contract does not. In this light, we expect that budgetary slack creation by collective units under both types of incentive contracts should be consistent with individual participation research. However, we expect differences between individuals and collective decision-makers in the magnitude of budgetary slack creation due, in part, to the social cognition that takes place during collective discussion of performance targets (Hackman, 1993; Hunton, 2001; Stasson & Bradshaw, 1995), which enhances all members’ understanding of how to “game” the budgeting process through socially-enhanced procedural knowledge (Wittenbaum & Stasser, 1996). Also, as discussed in an upcoming section, judgment differences between individuals and collective units are linked to varying risk propensities (Rutledge & Harrell, 1994). We begin by hypothesizing how collective decision-makers will react differently from individuals with respect to slack-inducing and truth-inducing contracts.

Slack Differences Between Individuals and Collective Units

One of the basic assumptions of agency theory is that individuals will act to maximize their self-interests (Baiman & Evans, 1983). While agency theory does not explicitly discuss collective decision-makers, it suggests that during collective discussion budget participants will also attempt to maximize their self-interests. Schopler et al. (1993) tested the degree to which collective units, as compared to individuals, pursued self-interest acts. They found that collective unit members tended to be more focused on self-interest acts than individuals, and that groups often provided a social support structure that encouraged members to maximize their self-interests.
Hence, collective units should “game” the budget process to a greater extent than individuals in an attempt to capitalize on each member’s self-interest. Prior research also suggests that collective units are more competitive than individuals, even in the absence of collective extrinsic rewards (Insko et al., 1990; Schopler & Insko, 1992; Schopler et al., 1991, 1993). Such competitive feelings build during intra-group social processes, particularly collective discussions, which foster an atmosphere of beating the perceived competition (Morgan & Tindale, 2002). Hence, collective discussion of budgetary incentive contracts should engender a collectively-oriented competitive spirit, thereby resulting in collective units attempting to “game” the budgetary process more aggressively than individuals in an effort to win. Further, accountability theory suggests that individuals, as compared to collective units, may be more reluctant to advocate extreme positions or make risky decisions because individuals inherently feel more accountable for their
decisions, even in the absence of external accountability mechanisms (Fandt & Ferris, 1990; Frink & Klimoski, 1998; Kroon et al., 1992; Tetlock, 1985). Individual decision-makers’ heightened feelings of accountability arise, in part, because there is no collective unit within which individuals can hide or shirk responsibility (Linden et al., 1999). Collective unit members, on the other hand, tend to feel less personal accountability than individuals; consequently, they often advocate more aggressive collective positions because each member feels that (s)he can hide within the group should (s)he fall short of pulling his/her share of the load (BarNir, 1998; Bateman et al., 1987). Accordingly, the extent of slack that individuals build into their budgets should be less extreme than that of collective units due to heightened feelings of personal accountability.

Lastly, group discussion should result in collective unit members being more fully informed than individuals regarding the merits of the incentive contract condition to which they are assigned, primarily because of the enhanced social cognition of collective unit members as compared to individuals (Hunton, 2001; Ono & Davis, 1988; Stasson & Bradshaw, 1995). While there may be wide discrepancies in understanding the advantages of different types of contracts at the individual level, collective units should provide a more accurate, unified understanding of the incentive contract because collective members bring different skills and capabilities to the decision process. Thus, on average, collective units should better understand the contractual conditions and how to “game” the process to achieve maximum gain.

Hence, under a slack-inducing contract, collective units should create more budgetary slack than individuals due to heightened aggressiveness to maximize self-interests, greater propensity to be competitive, lower feelings of personal accountability and enhanced social cognition of how to game the budget-setting process.
Regarding the latter issue, collective discussion should acutely reveal that the collective unit members could maximize their compensation by setting a budget well below their actual performance capability. Accordingly, the following hypothesis is presented (alternative form):

H1. The percentage of budgetary slack will be higher in the collective condition, as compared to the individual condition, under a slack-inducing contract.

Conversely, collective discussion should clearly reveal that under the truth-inducing incentive contract, collective unit members could best maximize their compensation by setting their budget as close as possible to their true productive capabilities. As discussed, collective group members are expected to be more risky or aggressive than individuals in this regard due to higher levels of self-interest, competition and cognition, and lower perceptions of personal accountability. As a result, while both individuals and collective units will attempt to estimate their
true productive capabilities, individuals should err on the side of under-estimation due to relative conservatism and collective unit members should err on the side of over-estimation due to relative aggression. The expectation of collective units is further supported by choice shift research, which indicates that collective unit members might over-estimate their “true” capability, as the momentum of shift during collective discussion is often more extreme than the initial predisposition of individuals prior to discussion (BarNir, 1998; Blumberg, 1994; Butler & Crino, 1992). Therefore, the following hypothesis is presented (alternative form):

H2. The percentage of budgetary slack will be lower in the collective condition, when compared to the individual condition, under a truth-inducing contract.

Performance Differences between Individuals and Collective Units

While the central focus of the current study is on the creation of budgetary slack, we also examine performance differences between individuals and collective units. Small group theory posits that, on the whole, collective members will perform better than individuals when the group accepts a common goal and has a history of working with each other (e.g. Carless & DePaola, 2000; Littlepage et al., 1997; Mennecke et al., 1992; Stasson & Bradshaw, 1995). However, for ad hoc groups, lack of social attraction and collective responsibility among members can lead to shirking or social loafing (e.g. Geen, 1991; George, 1995; Shepperd & Taylor, 1999). That is, some group members will perform below their true capabilities and “hide” within the group. In the current study, the collective units were composed of individuals who, prior to the experiment, had no history of working with each other in a collective environment.
Hence, we would expect that an ad hoc group of this nature would, on average, display lower performance than individuals due to shirking and social loafing, actions which arise from fairly low intrinsic feelings of personal accountability toward the other group members. Accordingly, the final hypothesis is offered (alternative form):

H3. The mean performance of individuals will be significantly higher than the mean performance of collective unit members.
EXPERIMENTAL METHOD

Research Design

This study employed a 2 × 2 factorial design, wherein incentive contract (truth-inducing vs. slack-inducing) and decision mode (individual vs. collective) were manipulated. The two dependent variables were the percentage of budgetary slack created and actual performance on the assigned task.
Measurement of Budgetary Slack

The operational definition of budgetary slack in this study is the difference between the participants’ self-reported expected performance after a practice period but before reading the incentive contract manipulation (i.e. Best Estimate of Production) and the participants’ self-set budget (i.e. Final Individual Budget) or the collective units’ jointly set budget (Final Collective Unit Budget) after the incentive contract manipulation (see Fig. 1). The difference is divided by expected performance (i.e. Best Estimate of Production) to normalize the measure and make it comparable across participants with different production capabilities (Stevens,
Fig. 1. Summary of Experimental Procedures.
2000). The resulting percentage of budgetary slack is then compared among treatment conditions.

Measurement of Actual Performance

The production task was adapted from Chow (1983), Chow et al. (1988), and Libby (2002). Study participants were provided with a decoding key wherein symbols were randomly assigned to each letter of the alphabet. Then, they were supplied with a list of words that had been coded in accordance with the symbols. The task was to decode as many seven-letter words as possible during a five-minute performance trial. Performance was measured as the number of correctly decoded words.

Covariates

Two covariates were examined in this study. First, individual differences in participants’ ability to correctly decode symbols into alphabetic characters were assessed using a five-minute practice session (before participants were exposed to the incentive contract). Chow, Cooper and Waller (1988) included performance capability as a covariate because a priori performance capability may affect the amount of budgetary slack. Second, the participants’ risk propensity was assessed because the risk attitude of individuals could affect the amount of budgetary slack they build into their final budgets. Risk propensity was measured by asking participants to respond to the following statement: “I would have to be ___% sure that I would receive $10 before I would willingly choose the gamble over receiving $5 for sure” (0–100%) (Young, 1985).

Description of Incentive Contracts

Participants in both individual and collective conditions were given either a slack-inducing or a truth-inducing incentive contract. Each contract provided a payoff table (Appendix) adapted from Libby (2002). The payoff table allowed the participant to calculate the number of raffle tickets she could earn toward the drawings of five $100 cash prizes; as such, performance rewards were directly related to performance outcomes.
Participants in both conditions read the following instructions:

The vertical axis of the table labeled “Budgeted Number of Words to be Decoded” is your Final Budget. The horizontal axis of the table labeled “Actual Number of Words Decoded” is the actual number of words decoded in the upcoming five-minute work period. You will be given 500 tickets to start the period. In prior periods, your co-workers have been able to decode a minimum of 10 words in the five-minute period. Your supervisor will not accept a Final Budget of less than 10 words.
To give you an example of how this earnings contract works, if you chose your budget to be 25 words and then you actually decoded 30 words (actual > budget), the number of tickets that you would earn would be:

Slack-inducing contract: Tickets earned = 500 (started period with) + 875 (payoff from table), for a total of 1375 tickets.
Truth-inducing contract: Tickets earned = 500 (started period with) + 450 (payoff from table), for a total of 950 tickets.
A minimum budget of 10 words was required so that participants could not set an unrealistically low budget of zero. Ten words were chosen as the minimum budget because two pilot studies indicated that all participants were able to decode at least ten words in the performance period. In both the individual and collective conditions, the participants earned raffle tickets toward the five $100 cash prizes based on their performance.

Materials and Experimental Procedures

Experimental sessions were conducted in 75-minute periods for individuals and 90-minute periods for the collective units. The collective unit sessions took 15 minutes longer to allow for group discussion. Participants were given the opportunity to decline to participate in the study. The four treatments were randomly assigned to twenty experimental sessions. The procedures are summarized next (see Fig. 1 for a graphical representation).

At the beginning of each session, all participants were assembled in a large room designed for behavioral experimentation. There was sufficient space between participants such that they could only see and focus on their experimental packet. An experimenter was in the room at all times to ensure that no interaction took place among participants. After reading a brief introduction and signing consent forms, all participants engaged in a 5-minute practice period. Afterward, participants in both conditions (individual and collective) self-evaluated their performance with a decoding answer key given to them by the experimenter. The experimenter later verified the performance of each participant. Then, the participants gave their Best Estimate of Production regarding the number of words they believed they could accurately decode during another 5-minute session.

At this point participants in the individual condition received the incentive contract manipulation (slack-inducing and truth-inducing contracts were randomly assigned to participants during each session).
After reading the incentive contract, the individual participants submitted their Final Individual Budget. Afterward, the participants were asked to participate in a 5-minute performance task.
Participants in the collective condition were randomly assigned to three-person collective units after giving their personal Best Estimate of Production. Controls were built into the experiment to assure that collective unit members could not change their Best Estimate of Production during or after collective discussion. Next, each collective unit assembled in separate rooms that were under visual control of the experimenter. However, the experimenter was not present in the rooms in order to allow for free exchange of information and opinions among collective unit members. Once the participants gathered into their collective units, they read their incentive contract, which was randomly assigned and provided to the collective unit in an envelope while they assembled in their discussion rooms. After a 15-minute collective discussion of the incentive contract, the collective unit jointly set a Final Collective Unit Budget. Then members of collective units returned to their seats and completed the task assignment during a 5-minute performance period.

After the 5-minute performance period the participants completed an exit questionnaire before a short debriefing. During the debriefing, participants were (1) asked not to discuss the study with others; (2) asked to sign an agreement to that effect; and (3) informed of the specific date when the winners of the $100 cash prizes would be announced.
RESULTS

Sample

The participants chosen for this study were undergraduate business students enrolled in multiple sections of Principles of Accounting at a large state university located in the southeastern portion of the United States. The students received extra credit points for participation. Additionally, to encourage hard work and maintain a reasonably high level of interest in the study, the students were able to earn raffle tickets based on their performance. Importantly, the number of raffle tickets earned was directly related to the task performance outcome. Five separate prizes of $100 each were awarded to the five raffle drawing winners.

A total of 298 students volunteered to participate in the experiment (177 females, 121 males), with a mean (standard deviation) age of 21.55 (2.85) years. Three participants failed to complete one or more necessary experimental measures, leaving a final usable sample of 295 participants. Statistical tests indicated no significant differences across treatment conditions based on age (F = 1.39, p = 0.18) or gender (χ² = 10.09, p = 0.18). In the individual condition, there were 79 participants, of which 40 performed under a slack-inducing contract and 39 performed under a truth-inducing contract.
In the collective condition, 216 individuals comprised 72 three-person collective units. Each three-person collective unit was considered as a single sample observation. Thus, there were 36 observations in each of the two collective treatment conditions (i.e. slack-inducing and truth-inducing incentive contracts). In total, there were 151 independent sample observations in the study.
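The sample counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Counts taken directly from the Sample description.
volunteers = 298
incomplete = 3
usable = volunteers - incomplete

individual_cells = {"slack-inducing": 40, "truth-inducing": 39}
collective_units = {"slack-inducing": 36, "truth-inducing": 36}
UNIT_SIZE = 3  # three-person collective units

n_individuals = sum(individual_cells.values())
n_units = sum(collective_units.values())
n_unit_members = UNIT_SIZE * n_units

# Units of analysis: each individual and each collective unit is one observation.
n_observations = n_individuals + n_units

print(usable, n_individuals, n_unit_members, n_observations)
```

The two totals matter later because the budgetary slack analysis treats each collective unit as one observation (151 observations), whereas the performance analysis is conducted at the participant level (295 participants); these correspond to the adjusted total degrees of freedom of 150 and 294 reported in Tables 1 and 2.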
Manipulation Checks

Participants responded to three statements concerning the incentive contract (truth-inducing or slack-inducing) manipulation. One statement read: “The incentive contract I worked under would motivate workers to set their budget at the number of words they could actually decode” (1 = Strongly Disagree, 7 = Strongly Agree). The mean (standard deviation) was 3.47 (2.12) in the slack-inducing condition and 5.18 (1.58) in the truth-inducing condition (t = 7.80, p < 0.01). The second statement read: “The incentive contract that I worked under would motivate workers to set their budget below the number of words they could actually decode” (1 = Strongly Disagree, 7 = Strongly Agree). The mean (standard deviation) was 5.11 (2.01) in the slack-inducing condition and 3.71 (1.88) in the truth-inducing condition (t = 6.19, p < 0.01). The final statement asked: “To what extent did your incentive contract affect how you set your Final Budget in comparison to what you really thought you could do?” (1 = Much lower than my expected performance, 5 = Equal to my expected performance, 9 = Much higher than my expected performance). The mean (standard deviation) was 3.85 (2.37) in the slack-inducing condition and 5.38 (1.52) in the truth-inducing condition (t = 6.57, p < 0.01). Based on test results, the incentive contract manipulation was considered successful.

Regarding the decision mode manipulation (individual versus collective unit), as discussed earlier, these two conditions were run in separate sessions due to logistical issues and internal validity concerns. Even though an experimenter was present to ensure that individuals did not interact with each other during the sessions and that collective unit members were placed into small rooms where they engaged in collective discussion, all participants responded to a manipulation check item in this regard.
Specifically, participants responded to a statement asserting that they were able to interact with other participants during the experimental session (1 = Strongly Disagree, 7 = Strongly Agree). The mean (standard deviation) response was 1.42 (1.13) in the individual condition and 6.53 (1.08) in the collective condition (t = 35.78, p < 0.01). Based on the efficacy of the experimental controls and results of the manipulation check item, the decision mode manipulation was deemed successful.
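The manipulation check statistics can be approximated from the reported means and standard deviations. The sketch below is a rough check under two assumptions that the text does not confirm: that an unequal-variance (Welch) t statistic is a reasonable stand-in for the test actually used, and that the group sizes equal the participant counts reported elsewhere in the paper (148 slack-inducing and 147 truth-inducing participants; 79 individuals and 216 collective unit members). The function name is illustrative.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic computed from group means, SDs, and sizes."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


if __name__ == "__main__":
    # (label, reported t, group 1 summary, group 2 summary); group sizes are assumed.
    checks = [
        ("budget at true capability", 7.80, (5.18, 1.58, 147), (3.47, 2.12, 148)),
        ("budget below true capability", 6.19, (5.11, 2.01, 148), (3.71, 1.88, 147)),
        ("effect on Final Budget", 6.57, (5.38, 1.52, 147), (3.85, 2.37, 148)),
        ("interaction with others", 35.78, (6.53, 1.08, 216), (1.42, 1.13, 79)),
    ]
    for label, reported, group1, group2 in checks:
        t_value = welch_t(*group1, *group2)
        print(f"{label}: computed t = {t_value:.2f}, reported t = {reported}")
```

Under these assumptions the computed values land close to, though not exactly on, the reported statistics, which is what one would expect if the original tests used slightly different group sizes or a pooled-variance formula.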
Preliminary Analyses

A MANCOVA model was used to determine statistically significant differences between treatment conditions. The dependent variables were percentage budgetary slack created and number of words decoded during the 5-minute performance period. The independent variables were decision mode (individual vs. collective unit) and incentive contract (slack-inducing vs. truth-inducing). The covariates were the number of words decoded during the 5-minute practice session and the individuals’ (mean collective units’) risk propensity.

MANCOVA tests revealed significant results for decision-maker (F = 1160.97, p < 0.01) and incentive contract (F = 17.13, p < 0.01). The two-way interaction was non-significant (F = 0.61, p = 0.54). The first covariate (5-minute practice period performance) was significant with respect to the percentage of budgetary slack (F = 22.03, p < 0.01) and the number of words decoded during the 5-minute performance period (F = 124.76, p < 0.01). The second covariate (risk propensity) was non-significant with respect to budgetary slack (F = 2.62, p = 0.108) and actual performance (F = 1.55, p = 0.15). Thus, only the first covariate was included in the upcoming ANCOVA models.
Hypothesis Testing

Hypothesis One

The first hypothesis (H1) anticipated that under a slack-inducing contract, budgetary slack would be higher for collective units, as compared to individuals. The mean percentages of budgetary slack created by collective units (m = 26.83%) and individuals (m = 10.71%) were significantly different based on parametric (F = 30.36, p < 0.01) and non-parametric (t = 5.04, p < 0.02) test results.1 Accordingly, the first hypothesis was supported.

Hypothesis Two

The second hypothesis (H2) stated that under a truth-inducing contract collective units should create lower budgetary slack than individuals. Contrary to expectations, the results shown in Table 1 indicate that budgetary slack was higher for collective units (m = 6.61%) than individuals (m = −4.91%). The planned comparison (F = 14.02, p < 0.01) and non-parametric test (t = 11.12, p < 0.01) indicate that the two means were significantly different. Interestingly, the individuals were more aggressive in setting their budget targets, as indicated by negative slack, than were the collective units. Thus, the second hypothesis was not supported.
Table 1. Percentage of Budgetary Slack Created.

Panel A: Results of ANCOVA Testing on the Percentage of Budgetary Slack^a

Source                  d.f.   Sum-Squares   F-Ratio   p-Value
Covariate^b                1          0.75     20.71     0.001
Decision-mode^c            1          0.71     19.57     0.001
Incentive contract^d       1          1.21     33.27     0.001
Interaction term           1          0.02      0.55     0.461
Error                    146          5.30
Total (adj.)             150          7.71

Panel B: Mean (Standard Deviation) Percentage Budgetary Slack^a and [Sample Size]

                  Collective                Individual               Main Effects
Slack-inducing    26.83% (20.42%) [36]      10.71% (29.93%) [40]     18.77% (26.70%) [76]
Truth-inducing    6.61% (8.39%) [36]        −4.91% (14.93%) [39]     0.85% (13.15%) [75]
Main effects      16.72% (18.32%)^c [72]    2.90% (24.76%) [79]      9.81% (22.68%) [151]

a Budgetary slack for the individual [collective] condition is calculated as the expected performance (Best Estimate of Production) of each participant [sum of the expected performances for all collective unit members] prior to receiving the incentive contract [and prior to engaging in collective discussion] minus the Final Budget [Group Budget] of each participant [collective unit] after receiving the incentive contract [and after engaging in collective discussion] divided by the Best Estimate of Production. Least square means are shown.
b For the individual [collective] condition, the covariate reflects the number of words decoded [the sum of words decoded for all collective unit members] during the 5-minute practice period.
c Decision-maker reflects Individual versus Collective Unit.
d Incentive Contract reflects Slack-Inducing versus Truth-Inducing.
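The slack measure defined in note a of Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the sample figures are hypothetical and are not taken from the study's data or materials.

```python
def budgetary_slack_pct(best_estimate: float, final_budget: float) -> float:
    """Percentage slack = (Best Estimate - Final Budget) / Best Estimate * 100.

    A positive value means the budget was set below expected performance
    (slack); a negative value means an aggressive budget target.
    """
    if best_estimate <= 0:
        raise ValueError("best_estimate must be positive")
    return (best_estimate - final_budget) / best_estimate * 100.0


def collective_slack_pct(member_best_estimates, group_budget: float) -> float:
    """Collective condition: the Best Estimate is the sum over unit members."""
    return budgetary_slack_pct(sum(member_best_estimates), group_budget)


# Hypothetical inputs, for illustration only (not the study's data).
individual_slack = budgetary_slack_pct(best_estimate=20, final_budget=18)
aggressive_slack = budgetary_slack_pct(best_estimate=20, final_budget=21)
unit_slack = collective_slack_pct([20, 22, 18], group_budget=45)

print(individual_slack, aggressive_slack, unit_slack)
```

Under this definition, a participant who budgets above the Best Estimate of Production shows negative slack, which is how the negative mean for individuals under the truth-inducing contract arises.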
Table 2. Number of Words Decoded During the 5-Minute Performance Period.

Panel A: Results of ANCOVA Testing on Decoding Performance^a

Source                  d.f.   Sum-Squares   F-Ratio   p-Value
Covariate^b                1      5,666.79    462.99     0.001
Decision-mode^c            1         61.69      5.04     0.032
Incentive contract^d       1          0.02      0.01     0.971
Interaction term           1         21.02      1.72     0.191
Error                    290      3,537.22
Total (adj.)             294      9,389.99

Panel B: Mean (Standard Deviation) Decoding Performance^a and [Sample Size]^e

                  Collective            Individual           Main Effects
Slack-inducing    22.19 (4.75) [108]    23.40 (6.23) [40]    22.52 (5.19) [148]
Truth-inducing    22.22 (6.21) [108]    24.41 (5.61) [39]    22.80 (6.11) [147]
Main effects      22.21 (5.51) [216]    23.90 (5.91) [79]    22.66 (5.66) [295]

a Independence among experimental participants is maintained, even for collective unit members, since all participants performed the decoding task in a supervised setting where interaction among participants was not allowed during the 5-minute practice and performance periods. Hence, for individuals and collective unit members, performance reflects the number of words correctly decoded during the 5-minute performance period. Least square means are shown.
b For individuals and collective unit members, the covariate reflects the number of words decoded during the 5-minute practice period.
c Decision-maker reflects Individual versus Collective Unit.
d Incentive Contract reflects Slack-Inducing versus Truth-Inducing.
e The expected mean squares of the ANCOVA were adjusted in accordance with Neter et al. (1990) for the unequal sample sizes between treatment conditions.
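As a rough reading aid for Panel A of Table 2, the sketch below computes textbook F-ratios (effect mean square divided by error mean square) from the reported sums of squares and degrees of freedom. It is an assumption-laden illustration: the function name is hypothetical, and because the reported F-ratios were adjusted for unequal sample sizes (note e), these unadjusted values should match the table only approximately.

```python
def f_ratio(ss_effect: float, df_effect: int, ss_error: float, df_error: int) -> float:
    """Textbook F-ratio: (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Sums of squares and degrees of freedom as reported in Table 2, Panel A.
SS_ERROR, DF_ERROR = 3537.22, 290
effect_sums_of_squares = {
    "Covariate": 5666.79,
    "Decision-mode": 61.69,
    "Incentive contract": 0.02,
    "Interaction term": 21.02,
}

# Each effect has one degree of freedom in the reported table.
unadjusted_f = {
    name: f_ratio(ss, 1, SS_ERROR, DF_ERROR)
    for name, ss in effect_sums_of_squares.items()
}

for name, value in unadjusted_f.items():
    print(f"{name}: F = {value:.2f}")
```

The decision-mode and interaction ratios land close to the reported 5.04 and 1.72; the covariate ratio differs somewhat more, which is consistent with the adjustment described in note e and with rounding in the published sums of squares.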
Hypothesis Three

The last hypothesis (H3) predicted that the mean performance of individuals would be higher than the mean performance of collective unit members. Results from ANCOVA testing in this regard are shown in Table 2 (Panel A), as are treatment means (least square), standard deviations and sample sizes (Panel B). As indicated, there is a significant main effect for decision mode, such that the mean performance of individuals (m = 23.90) is significantly greater than the mean performance of collective unit members (m = 22.21). A planned comparison (F = 5.04, p < 0.02) and non-parametric test (t = 4.70, p < 0.02) support the significance of this finding. Accordingly, the third hypothesis is supported. While a difference of 1.69 words (23.90 − 22.21) may not seem large from a practical standpoint, it is important to remember that the performance period lasted only five minutes. If a difference of this nature (7.7% productivity loss) were extrapolated over a longer time period, the performance decrease in the collective condition could have considerable efficiency implications.
Supplemental Observation The experimental design was not conducive to collecting a Final Individual Budget from collective unit members (prior to collective discussion) because the incentive contract was revealed to the collective unit as a whole. Hence, asking collective unit members to assess their expected performance after reading the incentive contract (as with participants in the individual decision mode condition) was not possible because independence among participants was violated as soon as they gathered into their collective units. Accordingly, collective unit members were not able to submit pre-discussion budget estimates. However, participants in the individual condition did submit a Final Individual Budget in the absence of collective discussion. Therefore, a precursory analysis of choice shift can be conducted in the current study. Choice shift was measured as the difference between the Final Individual Budget submitted by participants in the individual condition and the average Final Collective Unit Budget of collective unit members (i.e. the Final Collective Unit Budget divided by three members per group). Naturally, this method assumes that the Final Individual Budget serves as a proxy for the collective members’ pre-discussion budget. We recognize the inherent limitations of such an assumption. However, all participants were undergraduate students from the same university whose mean ages and gender proportions were not significantly different across treatment conditions. Additionally, the collective unit participants were randomly assigned to ad hoc groups. Thus, comparing the Final Individual Budget to the average Final Collective Unit Budget may not be entirely inappropriate. The analysis revealed that choice shift seemed to occur within both collective conditions (see Table 3). 
Within the individual slack-inducing contract condition, there was a mean Final Individual Budget of 16.45 words, whereas the average collective unit budget (per group member) was 12.94 words (t = −3.10, p < 0.01). Within the individual truth-inducing contract condition, there was a mean Final Individual Budget of 19.41 words, while the collective mean budget (per member) was 17.66 words (t = −1.75, p = 0.08). Based on this precursory analysis, there appeared to be a significant cautious choice shift when collective units performed under a slack-inducing contract, such that the collective unit reduced the number of budgeted words to be decoded per unit member, presumably to receive more compensation and reduce performance risk. Additionally, there was a marginally significant cautious choice shift under a truth-inducing contract, although choice shift theory would predict a risky shift in this circumstance. Hence, there is some indication that cautious choice shift occurred during collective budgeting in this study in both incentive contract conditions. Future studies should more rigorously investigate the choice shift phenomenon in the context of collective budget participation.
Table 3. Mean Choice Shift in Collective Condition.

Treatment                  Final Individual Budget^a   Average Collective Budget^b   Choice Shift^c   t-Statistic   p-Value
Slack-inducing contract                        16.45                         12.94             3.51          3.10     <0.01
Truth-inducing contract                        19.41                         17.66             1.75          1.75     =0.08

a Represents mean Final Individual Budget (i.e. the number of words participants believed they could decode during the 5-minute performance period) in the individual condition after receiving their incentive contract, which serves as proxies for the mean collective members’ Final Individual Budget they would have submitted after reading the incentive contract manipulation but before engaging in a collective discussion.
b Represents mean Final Collective Unit Budget per collective unit member (i.e. the consensus Group Budget divided by 3 members per unit) after they saw and discussed the incentive contract manipulation.
c Difference between the mean Final Individual Budget and mean Final Collective Unit Budget.
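The choice shift measure in Table 3 is a simple difference of means, which the following Python sketch reproduces from the reported figures. The function names and the sign convention labels are illustrative assumptions; only the four mean budgets come from the table.

```python
MEMBERS_PER_UNIT = 3  # each collective unit had three members


def per_member_budget(group_budget: float, members: int = MEMBERS_PER_UNIT) -> float:
    """Average collective budget: consensus Group Budget divided by unit size."""
    return group_budget / members


def choice_shift(mean_individual_budget: float, mean_collective_budget: float) -> float:
    """Choice shift = mean Final Individual Budget - mean per-member collective budget."""
    return round(mean_individual_budget - mean_collective_budget, 2)


def shift_direction(shift: float) -> str:
    """Positive shift: the unit budgeted fewer words per member (cautious)."""
    if shift > 0:
        return "cautious"
    if shift < 0:
        return "risky"
    return "none"


# Mean budgets (words per member) as reported in Table 3.
slack_shift = choice_shift(16.45, 12.94)
truth_shift = choice_shift(19.41, 17.66)

print(slack_shift, shift_direction(slack_shift))
print(truth_shift, shift_direction(truth_shift))
```

Both differences are positive, matching the cautious shift reported for the two contract conditions. The sketch says nothing about significance, which depends on the t-tests reported in the table.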
DISCUSSION This study was designed to provide insight into how individual and collective budgeting decisions differ with regard to budgetary slack. Specifically, we examined the impact of incentive contracts on budgetary slack with respect to individuals and collective units. Research findings provided empirical evidence for the efficacy of truth-inducing incentive schemes, as compared to slack-inducing schemes, in reducing the amount of budgetary slack in both individual and collective budgeting environments. While this finding has been reported in extant individual budgetary slack literature, the current study extends the investigation of truth-inducing schemes to collective units. A key result of the study is that collective units, on average, created more budgetary slack than the individuals. As expected, under a slack-inducing contract, collective budgetary slack was significantly greater than individual slack. Conversely, it was hypothesized that collective units would record less budgetary slack than individuals under a truth-inducing contract. Surprisingly, collective units still created significantly more budgetary slack than individuals. Apparently, collective units were more cautious in their budget estimates than individuals under both slack-inducing and truth-inducing contracts. This study also investigated actual performance differences between individuals and collective units. Based on small group theories of social attraction, social responsibility and social loafing, we expected that the mean performance of individuals would be higher than the mean performance of collective unit members. Statistical analysis supported this assertion, leading to the suggestion that collective unit members should receive both group-based and member-based performance incentives as a way to ward off potential deleterious effects of shirking or social loafing.
Post-hoc analysis of the collective units yielded preliminary evidence of choice shift, which was measured as the difference between the Final Individual Budget (submitted by participants in the individual condition) and the average Final Collective Unit Budget, per unit member. This analysis assumes that Final Individual Budget is a reasonable proxy for the pre-discussion budget that collective unit members would have submitted. Assuming that participants in both conditions were not significantly different from one another in any meaningful way (e.g. ability, cognition and motivation), this study suggests that cautious choice shift appears to have taken place in the collective budgeting environment. While we found preliminary evidence of choice shift, we can only suggest that persuasive arguments theory (Boster & Hale, 1989; Burnstein & Vinokur, 1975; Carnes et al., 1996; Hintz & Davis, 1984; Laughlin & Early, 1982) may help explain why the group members seemed to shift in a cautious direction. That is, collective members may have been influenced by the arguments and discussion of the group in deciding on their individual component of the group budget. A future study could examine more specifically what factors influenced members to shift in a cautious direction.
Contributions Collective participation and decision-making are commonplace in business organizations, yet researchers have focused surprisingly little effort on examining collective decisions, especially in the area of budgetary slack. The current study represents a step forward in this regard by proposing and testing how incentive contracts affect the creation of budgetary slack with respect to collective and individual decision-makers. While prior research provides useful insight into factors affecting the creation of budgetary slack, this study makes some important contributions to extant budgetary slack literature. First, while extant research on budgetary slack has focused on individuals, the current study extends this research thrust to include collective decision-makers. Second, small group research has indicated that members of collective units can be more or less risky than individuals with respect to decision-making. This phenomenon, called choice shift, was tested on a limited basis in the current study and we find that this psychological phenomenon could apply to collective budgeting decision scenarios. Third, this research can assist organizational management in identifying conditions that are likely to cause predictable risky or cautious shifts in the preparation of budgets where collective units are involved. Finally, we observe that actual performance of ad hoc groups is significantly less than individuals, possibly due to social loafing. More research is needed in this area to find ways to mitigate such inefficient behavior.
Limitations

This study is limited by validity threats common to laboratory experiments. First, accurate hypothesis guessing may have taken place. To check for this, after every experimental session, the debriefing included a question to participants about the purpose of the study. No subjects were able to correctly guess the hypotheses or the specific purpose of the study. Second, there may have been inadequate preoperationalization analysis of constructs and relationships. To preclude this problem, two pilot tests were conducted on both individuals and collective members to help mitigate possible misunderstanding of instructions and relationships. Refinements to the experimental materials were made based on the pilot tests. Third, since all participants could not be tested at the same time, there was a threat that one participant might tell another about the experiment. To preclude this problem, participants were asked to sign an agreement stating that they would not discuss the study with anyone until a specific date, which was after all experimental sessions were scheduled to be completed. Fourth, external validity associated with a laboratory is limited. Because of the contrived nature of a lab experiment and the controlled environment, this study lacks realism, which would be needed to generalize to the general population. However, the purpose of laboratory experimentation is to generalize through theory, which can be done to some extent in this study. Another external validity threat concerns the use of student subjects, rather than business managers who are involved in budgeting processes. While this remains a limitation, a recent study by Hunton (2001) dealing with information exchange within small groups reveals that students and business professionals do not differ with respect to the ways they acquire and interpret socially acquired cognition, such as procedural knowledge.
Hence, the use of students in this study may not be entirely void of generalization, particularly with respect to the dynamics of collective discussion. Fifth, individuals were randomly formed into ad hoc groups. This random formation process ignores social complexities inherent in natural groups whose members share a common history. However, the use of ad hoc groups is acceptable in this study for two reasons: (1) the objective was to engage participants in an artificial task in a laboratory setting for theory building purposes; and (2) the experimental task in this study did not require domain-specific expertise. Additionally, because a Final Individual Budget could not be assessed for members of collective units, only tentative conclusions about choice shift can be offered. Finally, conclusions about collective performance are limited because collective members worked on the final task as individuals.
Future Research Future research should continue to examine differences between individuals and collective units in creating budgetary slack. Although a few theories have been advanced to explain why collective members tend to be more or less risky in their decisions (e.g. social comparison theory and persuasive arguments theory), more research needs to focus on ways to mitigate the greater slack tendencies of collective units, as compared to individuals. For instance, a natural extension of this study would be a more rigorous analysis of choice shift by examining only collective units. A pre-discussion and post-discussion collective member’s budget estimate could be assessed. Then, choice shift, as well as a complementary phenomenon called group polarization, could be more thoroughly examined. Another study might further investigate the impact of accountability, and other debiasing mechanisms, on collective units with respect to budgetary slack and task performance. A future study might also investigate collective performance by having collective unit members work together on a group task. Another study might compare the impact of individual incentives and group incentives on budgetary slack and performance. Additionally, an analysis of how corporate culture and ethical values affect budgetary slack should be examined. Finally, while budgetary slack research has relied primarily on principal-agency theory, little research has studied the effects of the other factors, such as threat of dismissal or violation of trust between the subordinate and superior, on budgetary slack. Continued emphasis on team decision-making in the business environment begets fruitful research opportunities, as collective decision-making research has been largely ignored in management accounting literature. This research answers the call from Sutton and Hayne (1997) to explore new frontiers in collective (small group) research. 
Until now, accounting research, especially in management accounting, has focused on individual decision-making. Because teams are commonplace in business today, this study hopes to encourage further work in this rich area.
NOTE

1. Based on Levene’s test, the assumption of equal variance was violated for the performance metric (F = 12.60, p < 0.01) due to a smaller variance in the slack-inducing/collective unit treatment condition, as compared to the other three conditions. Additionally, the equal variance assumption was violated for the percentage of slack created, as truth-inducing contracts displayed significantly less variance than slack-inducing contracts (F = 10.76, p < 0.01). Stevens (2000) noted the latter problem in his experimental studies of budgetary slack. According to Hair et al. (1995), a violation of this assumption has minimal impact
if the treatment conditions are of approximately equal size (if the largest cell size divided by the smallest cell size is less than 1.5). In this study, the largest cell size was 40 in an individual condition cell, while the smallest cell size was 36 in the collective conditions. Therefore, because the cells were of approximately equal size (largest cell divided by smallest cell = 1.11), the violation is considered minimal. Nevertheless, because a violation of equal variance was noted for both dependent variables, hypotheses are tested using both parametric (ANCOVA and planned comparisons) and non-parametric (Kruskal-Wallis) tests. The normality of the dependent measures was also examined. These tests revealed no serious departures from normality for the percentage of slack or number of words decoded during the 5-minute performance trial.
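The cell-size screen described in this note (largest cell divided by smallest cell, compared against 1.5) can be illustrated with a short Python sketch. The function name and dictionary labels are hypothetical; the cell sizes are those reported for the budgetary slack analysis in Table 1, and the 1.5 threshold is the rule of thumb the note attributes to Hair et al. (1995).

```python
EQUAL_SIZE_THRESHOLD = 1.5  # rule of thumb cited in the note


def cell_size_ratio(cell_sizes) -> float:
    """Ratio of the largest to the smallest treatment cell."""
    sizes = list(cell_sizes)
    if not sizes or min(sizes) <= 0:
        raise ValueError("cell sizes must be positive and non-empty")
    return max(sizes) / min(sizes)


# Cell sizes from Table 1, Panel B (decision mode x incentive contract).
cells = {
    "collective / slack-inducing": 36,
    "collective / truth-inducing": 36,
    "individual / slack-inducing": 40,
    "individual / truth-inducing": 39,
}

ratio = cell_size_ratio(cells.values())
approximately_equal = ratio < EQUAL_SIZE_THRESHOLD

print(f"largest / smallest = {ratio:.2f}; approximately equal: {approximately_equal}")
```

With these cell sizes the ratio is 40 / 36, or about 1.11, which falls under the threshold. The sketch covers only this screening step, not Levene's test or the Kruskal-Wallis tests themselves.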
REFERENCES Baiman, S., & Evans, J. H. (1983). Pre-decision information and participative management control systems. Journal of Accounting Research, 21, 371–395. BarNir, A. (1998). Can group-and issue-related factors predict choice shift? A meta-analysis of group decisions on life dilemmas. Small Group Research, 29(June), 308–338. Bateman, T. S., Griffin, R. W., & Rubinstein, D. (1987). Social information processing and group-induced shifts in responses to task design. Group & Organization Studies, 12(1), 88–109. Blumberg, H. H. (1994). Group decision making and choice shift. In: A. P. Hare, H. H. Blumberg, M. R. Davies & M. V. Kent (Eds), Small Group Research: A Handbook (pp. 141–154). Norwood, NJ: Ablex Publishing Company. Boster, F. J., & Hale, J. L. (1989). Response scale ambiguity as a moderator of the choice shift. Communication Research, 16(August), 532–551. Burnstein, E., & Vinokur, A. (1975). What a person thinks upon learning he has chosen differently from others: Nice evidence for the persuasive arguments explanation of choice shifts. Journal of Experimental Social Psychology, 11, 412–426. Butler, J. K., & Crino, M. D. (1992). Effects of initial tendency and real risk on choice shift. Organizational Behavior and Human Decision Processes, 53, 14–34. Cammann, C. (1976). Effects of the use of control systems. Accounting, Organizations and Society, 301–314. Carless, S. A., & DePaola, C. (2000). The measurement of cohesion in workteams. Small Group Research, 31, 71–88. Carnes, G. A., Harwood, G. B., & Sawyers, R. B. (1996). A comparison of tax professionals’ individual and group decisions when resolving ambiguous tax questions. Journal of American Taxation Association, 18(Fall), 1–18. Chow, C. W. (1983). Providing incentives to limit budgetary slack. Cost and Management (September–October), 37–41. Chow, C. W., Cooper, J., & Waller, W. (1988). Participative budgeting: Effects of a truth-inducing pay scheme and information asymmetry on slack and performance. 
The Accounting Review (January), 111–122. Chow, C. W., Hwang, R. N., & Liao, W. (2000). Motivating truthful upward communication of private information: An experimental study of mechanisms from theory and practice. Abacus, 36(June), 160–179.
Christensen, J. (1982). The determination of performance standards and participation. Journal of Accounting Research (Autumn), 589–603. Douglas, P. C., & Wier, B. (2000). Integrating ethical dimensions into a model of budgetary slack creation. Working Paper. Dunk, A. S. (1993). The effect of budget emphasis and information asymmetry on the relation between budgetary participation and slack. The Accounting Review (April), 400–410. Fandt, P. M., & Ferris, G. R. (1990). The management of information and impressions: When employees behave opportunistically. Organizational Behavior and Human Decision Processes, 15, 405–416. Frink, D. D., & Klimoski, R. J. (1998). Toward a theory of accountability in organizations and human resources management. Research in Personnel and Human Resources Management, 16, 1–51. Geen, R. G. (1991). Social motivation. Annual Review of Psychology, 42, 377–399. George, J. M. (1995). Asymmetrical effects of rewards and punishments: The case of social loafing. Journal of Occupational and Organizational Psychology, 68, 327–338. Hackman, J. R. (1993). A normative model of work team effectiveness. (Technical Report No. 2.) New Haven, CT: Yale School of Organization and Management, Research Program on Group Effectiveness. Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1995). Multivariate data analysis. Englewood Cliffs, NJ: Prentice-Hall. Hintz, V. B., & Davis, J. H. (1984). Persuasive arguments theory, group polarization, and choice shifts. Personality and Social Psychology Bulletin, 10, 260–268. Hunton, J. E. (2001). Mitigating the common information sampling bias inherent in small-group discussion. Behavioral Research in Accounting, 13, 171–194. Insko, C. A., Schopler, J., Hoyle, R. H., Dardis, G. J., & Graetz, K. A. (1990). Individual-group discontinuity as a function of fear and greed. Journal of Personality and Social Psychology, 58, 68–79. Isenberg, D. J. (1986). Group polarization: A critical review and meta-analysis. 
Journal of Personality and Social Psychology, 1141–1151. Janis, I. L. (1982). Groupthink: Psychological studies of policy decisions and fiascoes (2nd ed.). Boston: Houghton Mifflin. Kamin, J. Y., & Ronen, J. (1981). Effects of budgetary control design on management decisions: Some empirical evidence. Decision Sciences, 12, 471–485. Kirby, A. J., Reichelstein, S., Sen, P. K., & Paik, T. (1991). Participation, slack, and budget-based performance evaluation. Journal of Accounting Research, 29, 109–128. Kroon, M. B. R., van Kreveld, D., & Rabbie, J. M. (1992). Group vs. individual decision making: Effects of accountability and gender on groupthink. Small Group Research (November), 427–458. Laughlin, P. R., & Early, P. C. (1982). Social combination models, persuasive arguments theory, social comparison theory, and choice shift. Journal of Personality and Social Psychology, 42, 273–280. Leibenstein, H. (1979). X-efficiency: From concept to theory. Challenge (September–October), 13–22. Libby, T. (2002). The effect of fair contracting processes on the creation of budgetary slack. Working Paper. Linden, R. C., Wayne, S. J., Judge, T. A., Sparrowe, R. T., Kraimer, M. L., & Franz, T. M. (1999). Management of poor performance: A comparison of manager, group member, and group disciplinary decisions. Journal of Applied Psychology, 84(6), 835–850. Littlepage, G., Robison, W., & Reddington, K. (1997). Effects of task experience and group experience on group performance, member ability, and recognition of expertise. Organizational Behavior and Human Decision Processes, 69, 133–147.
Lukka, K. (1988). Budgetary biasing in organizations: Theoretical framework and empirical evidence. Accounting, Organizations and Society, 281–301. Mennecke, B. E., Hoffer, J. A., & Wynne, B. E. (1992). The implications of group development and history for group support system theory and practice. Small Group Research, 23(4), 524–572. Merchant, K. A. (1985). Budgeting and the propensity to create budgetary slack. Accounting, Organizations and Society, 201–209. Merchant, K. A., & Manzoni, J. F. (1989). The achievability of budget targets in profit centers: A field study. The Accounting Review (July), 539–558. Merchant, K. A., & VanderStede, W. A. (2000). Ethical issues related to ‘results-oriented’ management control systems. Research on Accounting Ethics, 6, 153–169. Morgan, P. M., & Tindale, R. S. (2002). Group versus individual performance in mixed-motive situations: Exploring an inconsistency. Organizational Behavior and Human Decision Processes, 87(January), 44–65. Ono, K., & Davis, J. H. (1988). Individual judgment and group interaction: A variable perspective approach. Organizational Behavior and Human Decision Processes, 41, 211–232. Onsi, M. (1973). Factor analysis of behavioral variables affecting budgetary slack. The Accounting Review (July), 535–548. Rutledge, R. W., & Harrell, A. M. (1994). The impact of responsibility and framing of budgetary information on group-shifts. Behavioral Research in Accounting, 92–109. Schiff, M., & Lewin, A. Y. (1968). Where traditional budgeting fails. Financial Executive (May), 259–268. Schopler, J., & Insko, C. A. (1992). The discontinuity effect in interpersonal and intergroup relations: Generality and mediation. In: W. Stroebe & M. Hewstone (Eds), European Review of Social Psychology (Vol. 3, pp. 122–151). New York: Wiley. Schopler, J., Insko, C. A., Graetz, K. A., Drigotas, S., & Smith, V. A. (1991). 
The generality of the individual-group discontinuity effect: Variations in positivity-negativity of outcomes, players’ relative power, and magnitude of outcomes. Personality and Social Psychology Bulletin, 17(6), 612–624. Schopler, J., Insko, C. A., Graetz, K. A., Drigotas, S., Smith, V. A., & Dahl, K. (1993). Individual-group discontinuity: Further evidence for mediation by fear and greed. Personality and Social Psychology Bulletin, 19, 419–431. Shepperd, J. A., & Taylor, K. M. (1999). Social loafing and expectancy-value theory. Personality and Social Psychology Bulletin, 25, 1147–1158. Shields, J. F., & Shields, M. D. (1998). Antecedents of participative budgeting. Accounting, Organizations and Society, 23, 49–76. Siegel, G. H., & Sorensen, L. (1999). Counting more, counting less: Transformations in the management accounting profession. Practice Analysis study for Institute of Management Accountants. Stasson, M. F., & Bradshaw, S. P. (1995). Explanations of individual-group performance differences: What sort of “bonus” can be gained through group interaction? Small Group Research, 26, 296–308. Stevens, D. E. (2000). Determinants of budgetary slack in the laboratory: An investigation of controls for self-interested behavior. Working Paper. Sutton, S. G., & Hayne, S. C. (1997). Judgment and decision making, Part III: Group processes. In: V. Arnold & S. G. Sutton (Eds), Behavioral Accounting Research: Foundations and Frontiers. Sarasota, FL: American Accounting Association. Tetlock, P. E. (1985). Accountability: The neglected social context of judgment and choice. In: L. L. Cummings & B. Staw (Eds), Research in Organizational Behavior (p. 7). Greenwich, CT: JAI Press.
Umapathy, S. (1987). Current budgeting practices in U.S. industry: The state of the art. New York: Quorum. Waller, W. S. (1988). Slack in participative budgeting: The joint effect of a truth-inducing pay scheme and risk preferences. Accounting, Organizations and Society, 87–98. Waller, W. S., & Bishop, R. A. (1990). An experimental study of incentive pay schemes, communication, and intrafirm resource allocation. The Accounting Review, 65, 812–836. Weitzman, M. (1976). The new Soviet incentive model. Bell Journal of Economics (Spring), 251–257. Wittenbaum, G. M., & Stasser, G. (1996). Management of information in small groups. In: J. L. Nye & A. M. Brower (Eds), What’s Social About Social Cognition? (pp. 3–28). Thousand Oaks, CA: Sage. Young, S. M. (1985). Participative budgeting: The effects of risk aversion and asymmetric information on budgetary slack. Journal of Accounting Research (Autumn), 829–842. Young, S. M., & Lewis, B. (1995). Experimental incentive contracting research in management accounting. In: R. H. Ashton & A. H. Ashton (Eds), Judgment and Decision-making Research in Accounting and Auditing. Cambridge, NY: Cambridge University Press.
APPENDIX Incentive Contract Payoff Tables Slack-inducing payoff table:
Truth-inducing contract table:
BUDGET TEAM GOALS AND PERFORMANCE ANTECEDENT AND MEDIATING EFFECTS Peter Chalos, Margaret Poon, Dean Tjosvold and W. J. Dunn III ABSTRACT Organizations rely on budget teams for capital investment decisions. This study examined conditions that affected budget team performance. Variables included the formulation of cooperative, competitive and independent team budget goals and the mediating effect of budget information analysis between goals and budget performance. Two antecedents to budget goal formulation were examined, the budget knowledge of individual team members and organizational feedback control. Posited hypotheses were supported. Asymmetric budget knowledge between team members significantly increased independent and competitive budget goals and decreased cooperative budget goals. Organizational controls discouraged independent and competitive goals and encouraged cooperative budget goals. Cooperative (competitive and independent) budget goals improved (hindered) budget information analysis that in turn positively (negatively) affected budget performance.
Advances in Accounting Behavioral Research, Volume 7, 123–152
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07006-1
INTRODUCTION Management theorists have proposed that teams are an important source of social and intellectual capital to the firm (Barney, 2001; Nahapiet & Ghoshal, 1998). As virtually all Fortune 500 firms use teams in capital investment decisions (Lawler et al., 1995), understanding precisely how capital budget teams add investment value to the firm is an important but unresolved issue. Recent accounting studies have begun to consider the role of management control systems within team structures (Chalos & Poon, 2000; Drake et al., 1999; Hunton, 2001; Scott & Tiessen, 1999). The implicit assumption in the design of budget team functions is that group decisions improve firm performance but creating consistently effective teamwork has proven to be elusive for organizations (Katzenbach & Smith, 1993). Organizational and team process failures frequently lead to sub-optimal performance (Cohen & Bailey, 1997). Recent research suggests that the goals of budget team members are fundamental to performance. Budget goals galvanize effort, direct attention and encourage persistence and strategic development. A review of the budget literature suggests that goals of budget teams and the organizational controls that support these goals are powerful sources of performance motivation (Alper et al., 1998; Chalos & Poon, 2001) but a fundamental problem of budget cooperation stems from the fact that autonomous member goals may adversely affect effort and performance. The present study extended this line of research by addressing several issues. The effects of cooperative, competitive and autonomous team member budget goals upon performance, mediated by budget analysis, were examined. The impact of two budget antecedents upon capital budget goals was also analyzed, team member knowledge and organizational controls. 
Budget knowledge asymmetries between cross-functional team members were hypothesized to foster greater competition and independence and less cooperation in budget goal formulation. Organizational controls were speculated to increase cooperative budget goals and decrease independent and competitive goals. Cooperative goals were posited to improve budget analysis by the capital budget team that in turn was hypothesized to positively affect performance. The results corroborated the hypotheses and emphasized the importance of individual member knowledge and organizational controls to cooperative goal formulation, information sharing and performance. The paper proceeds as follows. The literature related to goal theory in budget teams is first reviewed. Theoretical linkages are developed between team member budget knowledge, organizational controls, cooperative, competitive and independent budget goals, information processing and performance. The methods section describes the test instrument, budget teams and process of establishing budget goals at the organizational site chosen for the study. This is followed by
reliability and validity checks of the findings, the partial least squares analysis and a discussion of the results. The discussion interprets the findings in the context of the extant literature and proposes an agenda for future research.
HYPOTHESIS DEVELOPMENT

Budget Goals

A significant body of research has established that budget goals at the individual level are important regulators of managerial action (for recent reviews, see Chalos & Poon, 2001 or Shields & Shields, 1998). Over a decade of budget research confirms that budget goals motivate individuals but is silent with respect to budget teams (Anthony & Govindarajan, 1998). Structural innovations, such as flat organizational structures and work groups, involve greater decentralization of delegated decision rights and the removal of barriers between organizational activities. Such seamless organizational linkages appear to be at odds with traditional divisional budget goals and profit centers. Issues of information dissemination and coordination, incentives and performance evaluation in capital budget teams within organizations remain unexplored areas of management accounting research.

Ouchi (1980) originally argued that markets (bureaucracies) are more efficient when goal incongruity is high (low) and performance ambiguity is low (high). Informal groups or “clans” are more suitable when goal incongruity is low and performance ambiguity is high, suggesting that budget teams may suffer in situations of member goal conflict. In their review of teams in organizations, Guzzo and Dickson (1996, p. 314) concluded that: “Evidence is clear. Compared with the absence of goals, specific and difficult goals for groups raise group performance, but reports of failures suggest that the degree of cooperation and communication affect performance.” The manner in which project managers believe their budget goals are related becomes an important variable affecting the dynamics and outcomes of their interaction as members of a capital investment team. Alper et al. (1998), Tjosvold and Tjosvold (1994) and others identified three alternative interpretations of goal interdependence by team members: cooperation, competition and independence.
When personal goals of team members are compatible, attraction to organizational goals increases. A fundamental problem of budget cooperation stems from the fact that cross-functional managers in budget teams have only partly overlapping goals. Managers often pursue incongruent objectives. As organizational capital is rationed between divisions, unless inter-divisional performance incentives exist, cooperation is unlikely. Cross-functional budget team members may believe that
their goals are competitive by considering that one member’s goal attainment precludes the goal attainment of others, a zero sum win-lose approach. Since managers may pursue incongruent objectives with uncoordinated effort, budget teams need to recognize the intersection of member goals. Capital budget teams with perceived competitive goal interdependence may conclude that when some team members gain resources under capital budget rationing, other team members lose resources. Mistrust restricts information and resource exchange and distorts communication between team members. When individual budget goals are more salient than the group goal, the motivation to work together to improve group budget performance decreases (Mitchell & Silver, 1990). Budget goal independence occurs when team members believe that their goals are unrelated. Team members with perceived goal independence conclude that it means little to them if others act effectively or ineffectively. Budget team members without joint incentives may not communicate and exchange resources (Drake et al., 1999). Goal independence may lower team efficiency. A cooperative budget goal creates interdependence because individual motives aroused by the presence of the goal can be satisfied only when the group performs well. The group budget goal links individual goals because each individual’s satisfaction depends on the group’s success (Scott & Tiessen, 1999).
Asymmetric Knowledge

Organizations institute capital budget teams that involve a number of functional areas. Budget teams provide a forum for the exchange and combination of tacit (implicit) as well as codified (explicit) knowledge of engineering, design, cost, production, logistics and marketing. Team member diversity is, however, frequently accompanied by knowledge asymmetry. While Guzzo and Dickson (1996) found that teams with greater member heterogeneity and with more access to internal and external firm knowledge improved performance, many organizational teams suffer “learning disabilities” (Senge, 1990). Unarticulated tacit knowledge as well as explicit knowledge must be captured. Studies of information sampling show that teams are particularly ineffective at identifying and pooling specialized knowledge possessed by individual members (Wittenbaum & Stasser, 1996). Group discussion is dominated by common knowledge, while asymmetric information not shared by members has less influence (Stasser & Titus, 1985). Research on teams suggests that at least two members of a team must have information in order for the team to adopt and implement an idea. Otherwise, social pressure within the team usually prevents the knowledge from being adopted. The bias against the dissemination of unique information possessed by a sole member of a budget team presents
a formidable threat to the effectiveness of cross-functional capital budget teams (Hunton, 2001). To elicit and organize budget knowledge, an effective informational architecture must be designed to overcome organizational boundaries. In Japanese budgeting, overlapping and redundant team member knowledge and communication patterns create common ground and understanding. This helps to form behavioral alliances and support systems for shared ideas and cooperative goals that might otherwise be ignored (Nonaka & Takeuchi, 1995). Although it has been suggested that knowledge asymmetry between budget team members might be linked to competitive and independent budget goals (Alper et al., 1998), no studies to date appear to have considered this in their analysis of goals and team performance. Capital budget teams with common knowledge should be more inclined to set cooperative goals. Interpersonal budget knowledge possessed by many members of the group should decrease independent and competitive behavior. Conversely, when knowledge is asymmetrically distributed between team members, each member may be reluctant to disclose information that is inconsistent with their own goal maximization. In such cases, autonomous, and possibly competitive, budget goals might be fostered. Agency theorists have consistently found that managerial information asymmetry leads to goal incongruity unless the incentive system between principal and agent aligns their mutual interests. Managers with access to private knowledge bias budgets and set individual budget goals that do not maximize the welfare of the organization (Dunk, 1993). Based on the empirical evidence to date, it is hypothesized that:

H1a. Independent team member budget goals are positively associated with knowledge asymmetry between budget team members.

H1b. Cooperative team member budget goals are negatively associated with knowledge asymmetry between budget team members.

H1c. Competitive team member budget goals are positively associated with knowledge asymmetry between budget team members.
Organizational Controls

Managing teams requires appropriate controls (Simons, 1995). An organization can develop more cooperative budget goals through the appropriate use of management controls. Top down management is commonly thought to impede teamwork, whereas a bottom up approach is thought to encourage local autonomy and knowledge. However, it may be difficult to share distributed knowledge
and coordinate effort with a bottom up approach. Confronted with independent and competitive goals, firms frequently rely on hierarchical controls to reduce fragmentation and strengthen coordination. Ouchi (1980, p. 135) has argued that bureaucratic intervention is needed to promote managerial and employee cooperation “when goal incongruity and performance ambiguity are high.” Organizational controls are useful for restricting individuals from pursuing their individual budget goals at the expense of the firm’s overall capital budget goals. Simon (1981), noting the presence of hierarchy in all organizational systems, has argued that vertical hierarchy is efficient and robust against distortions of the cybernetic goals of the organization. Agency theorists in analytical budget research have argued that hierarchical control can be efficient and effective in promoting cooperation among managers and employees (Villadsen, 1995). Malone (1987) concluded that hierarchy decreases coordination and vulnerability costs associated with risk. Mookherjee and Reichelstein (1997, p. 147) further demonstrated that hierarchies are: “effective in creating incentives and coordinating decisions . . . firms do not necessarily become more efficient by eliminating hierarchical layers.” Tirole (1986) has argued that even flatter matrix organizations are characterized by informal hierarchical structures. Budget teams are commonly designed to solve decisions lower in an organizational hierarchy as an addendum to existing functional structures (Galbraith, 1994). Henderson and Lee (1992) compared design teams across organizations and found that the highest performing teams were those in which upper level managers retained control over members in developing tasks. Similarly, Kim and Lee (1995) found that a lack of budget controls had a negative association with performance for research and development teams. 
Levi and Slem (1995) found that self-managed teams did not perform as well as managed teams. Giroux et al. (1986) examined vertical and horizontal power relationships in the establishment of budgets and found that power over budget formulation was often centralized in a vertical hierarchy. Power was decentralized only for budget implementation. They proposed that hierarchy reduces uncertainty and improves performance. Eisenhardt and Tabrizi (1995) found that team goal clarity improved as management intervened in the process. Autonomous teams performed poorly. Managers appreciated the ability of upper level management to clarify, process, and provide technical assistance. Guzzo and Dickson (1996, pp. 326–327) in their review of team effectiveness concluded that: “There is substantial variance in research findings regarding the consequences of autonomous work groups (and that) the performance relationship between organizational intervention and team goals is not well understood. Future research is needed to clarify how organizational structure relates to team
goals.” Cohen and Bailey (1997, p. 26) in their team review article concluded more bluntly: “Autonomy is neither desired nor beneficial to project teams.” The organizational context in which a capital budget team operates clearly moderates or accentuates its effectiveness. Highly interdependent, problem-solving groups need organizational direction and support in order to be effective. Management control hierarchy helps to reinforce organizational budget goals. Scott and Tiessen (1999) found that in team structures the interaction between activity accounting and rewards based on group incentives was associated with cooperative innovations, lower costs and higher profits. Drake et al. (1999) also found that incentive schemes affected team budget cooperation. An organization united behind the strong influence of top management provides clear guidance and clarifies the goals and duties of project teams. Vertical hierarchy is likely to help group members believe that they have common, cooperative goals as articulated and reinforced by top management and fewer competitive and independent goals. With disinterested top management, more dispirited and less committed team members may focus on individual or competitive goals that they believe have a higher probability of success and recognition than more cooperative team goals. Team members may also perceive the organizational climate as more fragmented and competitive. Accordingly, the following hypotheses are examined:

H2a. Independent team member budget goals are negatively associated with control hierarchy.

H2b. Cooperative team member budget goals are positively associated with control hierarchy.

H2c. Competitive team member budget goals are negatively associated with control hierarchy.
Information Analysis

Budget teams process information selectively on the basis of goals and objectives. Unless a budget team has a shared goal, information analysis is likely to be ineffective. A high degree of team member cooperation is not a guarantee of efficient information processing (Young & Selto, 1993). Rather, what information is selected and the degree of information sharing are important. Team members working under cooperative goals have been found to share more task relevant information, pay more attention to the ideas of others and experience fewer communication difficulties than group members working under competitive or
individualistic goals (Johnson & Johnson, 1989). Teams in a cooperative goal setting that share information through constructive controversy have also been found to achieve greater performance (Alper et al., 1998). Conversely, studies have found that competitive goals restrict information processing and resource exchange and distort communication. Independent goals induced an indifference to the interests of other team members and led to fewer incentives to communicate and exchange information. While some researchers have suggested that individualistic reward structures promote higher achievement (Hayes, 1976), Johnson and Johnson (1989) in their review of studies of goal structures concluded that cooperative goals promoted greater information processing and achievement than individualistic goals. Generally, independence has been found to have similar though not as strong negative effects on group interaction and productivity as competition. Based on the above findings, the following hypotheses are posited:

H3a. Budget information processing is negatively associated with independent team member goals.

H3b. Budget information processing is positively associated with cooperative team member goals.

H3c. Budget information processing is negatively associated with competitive team member goals.
Performance

Researchers have long focused on information processing associated with team performance within organizations. A significant body of literature exists related to group information processing and performance (Bettenhausen, 1991; Chalos & Pickard, 1985; Laughlin & Hollingshead, 1995). Findings consistently point to the importance of information processing to decision quality and indicate that a primary reason for teams to participate in a process is to create and share information. Coordination of efforts, even through conflict, improves the processing of information (Stasser et al., 1995) along with reduced cognitive constraint in teams with interpersonal knowledge (Carver & Scheier, 1981). In their study of budget teams in a manufacturing setting, Banker et al. (1996) demonstrated that information processing in teams plays a crucial role in both output quality and labor productivity. The manner in which team budget information, ideas and cognitions are processed strongly affects outcomes (Ickes & Gonzalez, 1994). In their literature
review of team effectiveness, Campion et al. (1993) found cooperation and information sharing to be significantly correlated with team productivity and effectiveness. Teams processed information on the basis of objectives, tasks, missions and collective goals. Team effectiveness varied as a function of what information was shared as well as the degree of information sharing (Mackie & Goethals, 1987). Based on the above, it is hypothesized that:

H4. Budget performance is positively associated with team information processing.

The hypotheses are summarized in Fig. 1.

Fig. 1. Causal Model of Performance.
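As a reading aid, the hypothesized paths H1a through H4 can be written down as a small signed graph. The sketch below is illustrative only: the variable labels and the helper function are assumptions introduced here, not names used in the study, and the code encodes the hypothesized directions rather than any estimated coefficients.

```python
# Hypothesized sign of each path, transcribed from H1a-H4 in the text.
# Variable labels are illustrative assumptions, not identifiers from the study.
HYPOTHESES = {
    "H1a": ("knowledge_asymmetry", "independent_goals", +1),
    "H1b": ("knowledge_asymmetry", "cooperative_goals", -1),
    "H1c": ("knowledge_asymmetry", "competitive_goals", +1),
    "H2a": ("control_hierarchy", "independent_goals", -1),
    "H2b": ("control_hierarchy", "cooperative_goals", +1),
    "H2c": ("control_hierarchy", "competitive_goals", -1),
    "H3a": ("independent_goals", "information_processing", -1),
    "H3b": ("cooperative_goals", "information_processing", +1),
    "H3c": ("competitive_goals", "information_processing", -1),
    "H4": ("information_processing", "performance", +1),
}


def implied_indirect_sign(path):
    """Multiply the hypothesized signs along a chain of variables.

    Hypothetical helper: it only combines the stated directions and says
    nothing about the size or significance of any effect.
    """
    sign = 1
    for src, dst in zip(path, path[1:]):
        matches = [s for (a, b, s) in HYPOTHESES.values() if (a, b) == (src, dst)]
        if not matches:
            raise KeyError(f"no hypothesized path {src} -> {dst}")
        sign *= matches[0]
    return sign
```

Under these hypotheses, for example, the chain from knowledge asymmetry through cooperative goals and information processing to performance carries a negative implied sign, while the chain from control hierarchy through competitive goals carries a positive one.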
RESEARCH METHODS

Sample of Respondents

Most research examines teams within the context of a single organization (Cohen & Bailey, 1997). For this study, a large multinational firm (US$2.5 billion in net assets) that made extensive use of capital budget teams offered to participate. The firm had experienced a 13.2% growth rate during the most recent decade. In an effort to assure continued productivity gains and growth, during the past several years the firm had undergone significant restructuring with specific emphasis on organizational teams. In the words of its president: “The management control system is now aligned with our objective of developing an organizational structure with emphasis on teamwork and enhanced performance.” For the study, division heads volunteered a total of 190 individuals from 60 capital project teams.

The organizational structure of the firm was broken down into strategic development, human resources, finance and three business groups. Each business group consisted of between eight and fourteen departments. These departments were responsible for the development of capital investment budgets. Meeting the revenue, productivity and cost goals of these capital budgets was vital to the performance of each business unit. Interviews with department heads revealed that capital budgets included engineering projects, systems hardware and software investments, distribution upgrading, environmental improvements and property, plant and equipment expenditures. Individual project expenditures typically ranged from ten to twenty million dollars. As these cumulative annual investments amounted to hundreds of millions of dollars for the firm, senior management was particularly interested in formally evaluating the success of the firm’s capital budgeting process.
Capital budgeting requests and subsequent post-investment audits were performed within departments by designated budget teams. Since the capital budget task involved considerable specialized knowledge, judgment and expertise, teams were cross-functional, drawing their membership from different areas within the departments. All teams had worked together for a minimum period of six months. When a project was finished, team members either moved on to the next project or else returned to their functional unit. Budget projections required technical and financial knowledge and expertise of budget team members to justify the cost and revenue projections required of each capital request. While team members collectively had a high degree of cost, engineering and technical knowledge, individual members sometimes lacked the financial and operational knowledge needed for planning due to their diverse departmental backgrounds. Each team consulted with a different project operations officer or budget consultant within their division assigned to the project who provided financial and operational knowledge and advice as well as information to team members on an ad hoc basis. Project budget consultants, although not formally a part of project teams, met periodically with the teams to provide budget planning regarding cost and revenue line item assumptions, past history of item budget accuracy, variances, operational links to the financial budget, resource availability, and capacity. Management strongly encouraged accurate and cooperative goal setting from its budget teams and discouraged independent or competitive goal setting that would prevent the firm from obtaining its objectives. The formulation and development of accurate budgetary goals required detailed information of individual project team managers. The private information and agendas of individual team members were sometimes considered an obstruction to cooperation. 
Interviews with team members and more direct observations of team meetings indicated that within teams, the goals of individual members did not always coincide. Individual team members’ private agendas, motivations and interests were displayed during team deliberations. For example, a project that did not directly benefit a team member’s area of responsibility might not receive the information, commitment and, ultimately, member endorsement that it would otherwise receive. The result was that independent goals of individual team members sometimes took precedence over those of the team. All budget projects were subject to hierarchical divisional review and approval. In order to develop accurate and realistic budgets, several levels of divisional management communicated with budget teams on an ongoing basis. While upper level management monitored team member involvement and acceptance of responsibility, management did not directly participate in project team deliberations. Upper level management feedback to budget teams indicated what
assumptions needed additional justification, what to look at further in the analysis and where to attain needed capital improvements. Team budgets were initially discussed with the head of the division for further revision and review. Budget requests were then submitted to the division executive committee for additional scrutiny, feedback and approval. The finance committee then reviewed the requests with further suggestions and the division directors gave final project approvals. Although hierarchical feedback controls slowed the process, management felt that it improved team budget performance. Flatter decision structures were considered to be unstable, broader in information consideration, too autonomous and more prone to riskier and poorer performance decisions. In discussions with management, it was clear that organizational reviews of project teams were essential to the approval and eventual acceptance of budget proposals. A project manager supervised the ongoing deliberations of each budget team to ensure that the team project deliberations were effective and efficient. As each capital budget was on a timeline and part of an overall expenditure budget, it was essential that each capital budget team function with due deliberation, dispatch, and accountability. The performance of each project team was formally considered by departmental division heads in evaluations at the conclusion of each project and was included in the semi-annual performance evaluation process of each manager. Although not a direct participant in the team process, the division head met with individual team members as well as the team on a regular basis throughout the project in order to assess project progress. Interviews with division heads suggested that successful budget teams paid close attention to coordination of project activities, investigation of operational problems, forecast accuracy of costs and demand, negotiation with suppliers, and supervision of project personnel. 
Without adequate performance in these areas, division heads thought that their budget team representatives would not develop credible budget proposals.
Test Instrument Administration

The administration of the test instruments took place over an 18-month period. Test instruments were administered to different respondent groups to maintain the independence of the variable observations and to avoid possible common method bias attributable to the same manager responding to both independent and dependent variables. Once a team had completed its budget proposal, a firm member contacted the researchers who then administered a test instrument to each of the respondent groups. The team respondents for the study included
190 managers from 60 budget teams. Four test instruments with missing values were subsequently deleted, leaving 56 fully completed team surveys for analysis. Each team member evaluated the budget goal characteristics of the team’s capital budget project. A twelve item scale representing cooperative, competitive and independent budget goals was adapted from previous goal studies (Alper et al., 1998; Tjosvold & Tjosvold, 1994). Individual team member responses were statistically averaged to represent the team response in order to minimize possible response bias associated with a single team interactive scaling of the goals. Hierarchical control was measured by asking four upper levels of management to rate their influence on the project teams (Giroux et al., 1986). Levels of hierarchical review were based upon preliminary discussions with management and the firm’s organizational chart and included the head of the business group, the senior executive committee, the finance committee, and the Board of Directors. Each of the four levels of upper management independently evaluated its respective influence upon the budget proposal that the division project team submitted for review. These responses were statistically averaged to represent the influence of hierarchical control on each budget team. Based on their responses to team member requests for advice and observations of shared team member knowledge during project team deliberations, budget project consultants were aware of the amount of distributed operational and financial knowledge that was held among team members. 
Similar to prior studies of asymmetric budget knowledge between managers (Dunk, 1995; Shields & Young, 1993), team budget consultants assessed the amount of distributed knowledge possessed by team members of: technical processes of capital budgets; production input and output estimates related to operational aspects of the capital budget; revenue and cost projections of the capital budget; labor scheduling for the budget; suppliers; and financial aspects of the budget. A project manager assigned to each team evaluated the information processing skills of the team. Based on Guzzo and Dickson (1996), project team managers were asked to scale the budget information processing that they observed between team members. This included the extent to which the budget team members: expressed their own views; shared financial and technical information; functioned as a cohesive unit; were open minded and supportive of each other; and actively and constructively considered available budget information. Using the Mahoney et al. (1965) index of budget performance that has been used in dozens of prior budget studies (see Chalos & Poon, 2001, pp. 197–198, for a review), each division head evaluated the performance of the budget team in his/her division. An overall measure of performance was included to cross-validate the summative scores of the eight dimensions of performance. The performance items include budget planning, investigation, coordination, evaluation, supervision, staffing, negotiation and representation. The survey items representing each variable construct are summarized in Table 1.

Table 1. Survey Items of Variable Constructs.

Variable/Respondent Sample: Organization controls/upper level management
Variable Construct Items: Evaluate the relative influence of the following parties in the budgetary process: (1 = very little influence; 7 = very much influence)
• Board of directors
• Finance committee
• Senior executive committee
• Division head
Source (Adapted): Firm organizational chart, Giroux, Mayper and Daft (1986)

Variable/Respondent Sample: Team budget knowledge/budget consultants
Variable Construct Items: Assess the amount of knowledge possessed by team members: (1 = knowledge widely known and distributed; 7 = knowledge not widely known and distributed)
• Technical processes of the capital budget
• Production input and output estimates related to the operational aspects of the capital budget
• Revenue and cost projections of the capital budget
• Detailed financial understanding of all aspects of the capital budget project
• Sources of supply for the capital budget
• Labor scheduling for the capital budget
Source (Adapted): Dunk (1995), Shields and Young (1993)

Variable/Respondent Sample: Goals (a)/team members
Variable Construct Items: Indicate how well each statement describes how your budget team members work together. (1 = strongly disagree; 7 = strongly agree)
• My budget team members feel that we are in the budget outcome together
• What helps my budget team members, sometimes gets in my way
• Each budget team member does his own thing
• My budget team members feel that we are on the same side
• My budget team members have a win-lose relationship
• My budget team members like to be successful through their individual work
• My budget team members sink or swim together
• My budget team members like to show that they are superior to me
• My budget team members work for their own interests
• People in my budget team want each other to succeed
• The goals of my budget team members are incompatible with each other
• Budget team members are unconcerned with what each manager wants to accomplish
Source (Adapted): Alper, Tjosvold and Law (1998), Tjosvold and Tjosvold (1994)

Variable/Respondent Sample: Information processing/project managers
Variable Construct Items: Please indicate how the budget team interacts when addressing capital budgeting issues: (1 = strongly disagree; 7 = strongly agree)
• Budget team members express their own views
• Budget team members share financial and technical information with each other
• The budget team functions as a cohesive and functional unit
• Budget team members are open minded and supportive of each other
• Relevant budget information is considered by the budget team
Source (Adapted): Guzzo and Dickson (1996)

Variable/Respondent Sample: Performance/division heads
Variable Construct Items: Please rate the performance of the capital budget team on the following tasks: (1 = below average; 7 = above average)
• Planning
• Investigating
• Coordinating
• Evaluating
• Supervising
• Staffing
• Negotiating
• Representing
• Rate your overall performance
Source (Adapted): Mahoney, Jerdee and Carroll (1965)

(a) Questions 1, 4, 7 and 10 represented cooperative goals. Questions 2, 5, 9 and 12 represented competitive goals. Questions 3, 6, 8 and 11 represented independent goals.
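The scoring of the twelve-item goal scale and its aggregation to the team level can be sketched as follows. This is a minimal illustration under stated assumptions: the item-to-subscale mapping is taken from the note to Table 1, each subscale score is assumed to be a simple mean of its four items, and the function names and the sample ratings are hypothetical rather than drawn from the study.

```python
from statistics import mean

# Item numbers (1-based) for each goal subscale, as given in the note to Table 1.
SUBSCALES = {
    "cooperative": (1, 4, 7, 10),
    "competitive": (2, 5, 9, 12),
    "independent": (3, 6, 8, 11),
}


def member_scores(responses):
    """Average one member's twelve 1-7 ratings into three subscale scores.

    Assumes an unweighted mean of the four items in each subscale.
    """
    if len(responses) != 12:
        raise ValueError("expected twelve item responses")
    return {
        name: mean(responses[i - 1] for i in items)
        for name, items in SUBSCALES.items()
    }


def team_scores(members):
    """Average member-level subscale scores to represent the team response."""
    scored = [member_scores(m) for m in members]
    return {name: mean(s[name] for s in scored) for name in SUBSCALES}


# Hypothetical two-member team; the ratings are made up for illustration.
team = [
    [6, 2, 3, 6, 2, 3, 6, 3, 2, 6, 3, 2],
    [4, 4, 5, 4, 4, 5, 4, 5, 4, 4, 5, 4],
]
team_level = team_scores(team)
```

Averaging at the member level first and then across members gives each respondent equal weight within a team, which matches the description of team responses as the mean of individual responses.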
RESULTS

Respondent Checks

Usable responses, in the form of fully completed surveys, were received from 177 team respondents, representing 56 teams. These included 13 two-member teams, 26 three-member teams, 7 four-member teams, and 9 five-member teams. Demographic statistics of the team respondents as well as survey variables are included in Table 2. Debriefing information indicated that the average respondent was 41 years of age, with mean experience at the firm of 12 years. The majority of the team respondents were male (94%). Capital budget projects varied between 3 and 36 months in duration and ranged in expenditure amount from US$3 to $27 million.

To test for possible project and respondent effects on team responses, ANOVAs were run on each of the demographic variables between the three goal levels. The Fmax statistic indicated insignificant heterogeneity of variance. None of the demographic variables was statistically significant between the three goal constructs. Nor were project duration, project value, respondent age, sex and work experience significant as covariates when run with each of the three goal constructs independently against information processing. Because team size varied, an ANOVA was also run within each of the goal constructs with team size as a factor. Using Scheffé's conservative a posteriori test of all possible mean differences between the different team sizes, the overall F value did not approach conventional significance (Winer, 1993). Nor was team size significant as a covariate when run with each of the three goal constructs independently against information processing. Accordingly, unweighted sums of squares were used for the three goal levels scaled by the teams in the partial least squares analysis.

Table 2. Descriptive Statistics.

Panel A: Survey Constructs (n = 56)

| Construct | Mean | Std. Dev. | Min. | Max. | Inter-Rater Reliability | Alpha Coefficient | Eigenvalue |
|---|---|---|---|---|---|---|---|
| Organization controls | 5.18 | 0.90 | 2.3 | 7.0 | 0.75 | 0.79 | 2.49 |
| Team budget knowledge | 4.53 | 0.67 | 2.9 | 6.3 | 0.79 | 0.87 | 3.12 |
| Competitive goals | 3.53 | 1.33 | 1.4 | 5.7 | 0.81 | 0.77 | 3.66 |
| Independent goals | 3.67 | 1.02 | 1.6 | 6.8 | 0.68 | 0.83 | 2.78 |
| Cooperative goals | 5.12 | 0.89 | 1.8 | 6.8 | 0.74 | 0.85 | 4.12 |
| Information processing | 5.19 | 0.95 | 1.0 | 6.7 | 0.82 | 0.78 | 3.59 |
| Performance | 4.77 | 0.75 | 1.6 | 6.2 | 0.83 | 0.93 | 7.91 |

Panel B: Respondent Variables (n = 177)

| Variable | Mean | Std. Dev. | Min. | Max. |
|---|---|---|---|---|
| Age | 41.3 | 4.2 | 26 | 58 |
| Years experience | 12.2 | 4.6 | 2 | 23 |
| Project duration (months) | 15.1 | 4.7 | 3 | 36 |
| Project amount ($ millions) | 16.4 | 5.1 | 2.6 | 27.2 |

Sex: M = 94%, F = 6%
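The homogeneity-of-variance check mentioned above can be illustrated with a minimal sketch of Hartley's Fmax statistic, the ratio of the largest to the smallest group variance. The respondent ages below are hypothetical and are not taken from the study; in practice the resulting ratio would be compared against a tabulated critical value for the number of groups and the group size.

```python
from statistics import variance


def hartley_fmax(groups):
    """Return Hartley's F_max: largest sample variance divided by the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical respondent ages in three goal groups (illustrative only).
ages_by_goal = {
    "cooperative": [38, 41, 44, 40, 42],
    "competitive": [39, 43, 45, 37, 41],
    "independent": [36, 42, 47, 40, 40],
}

f_max = hartley_fmax(list(ages_by_goal.values()))
print(round(f_max, 2))
```

A ratio close to one indicates similar variances across the goal groups, which is the condition the ANOVAs described above rely on.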
Method Bias

Several precautions were taken to minimize the likelihood of method bias. First, to lessen recall bias, each set of respondents filled out an evaluation at the conclusion of a project. Immediate feedback lessened the likelihood of memory recall bias. Second, each respondent sample (upper level management, project consultants, project managers, division heads) independently evaluated a budget team based upon their ongoing project interaction with that team. No respondent in any of these samples was a member of the project teams. Third, although the sample of respondents at each level in the hierarchy of organizational controls was not the same, the division heads constituted the first of the four hierarchical levels and also carried out the performance evaluations. Possible response bias was checked in the following manner. Internal reliability tests of the organizational control hierarchy variable were performed with and without division heads. Factor loadings, alpha coefficients and confirmatory fit indices were all higher with the inclusion of division head responses in the organization control variable. Tests of organization control hypotheses were also run with and without the division heads as part of the control variable. The sign and level of significance of organization control results remained unchanged. Based on these results, division heads were retained in the organizational measurement.

Table 2 includes descriptive statistics of team responses on each variable construct. As explained, these responses represented the mean of the individual responses for each team. This is a common metric reported in the team literature (Alper et al., 1998; Guzzo & Dickson, 1996) designed to avoid the bias associated with a single interacting team response. Team inter-rater reliability was measured by an intra-class correlation coefficient (I.C.C., Shrout & Fleiss, 1979). I.C.C. (1, k) is the lower bound estimate of the mean rater reliability based on k
team members. Across constructs, these coefficients ranged from 0.68 to 0.83, indicating high inter-rater reliability for each construct.
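As an illustration of the inter-rater measure, the following is a minimal sketch of the one-way random effects I.C.C. (1, k) of Shrout and Fleiss (1979), computed as (BMS − WMS) / BMS from the between-target and within-target mean squares. It assumes the same number of raters k for every target, which is a simplification relative to the two- to five-member teams in the study, and the ratings are hypothetical.

```python
def icc_1k(ratings):
    """ICC(1,k) for a targets-by-raters table with equal k per target.

    Uses the one-way ANOVA mean squares: (BMS - WMS) / BMS.
    """
    n = len(ratings)        # number of rated targets (e.g. teams)
    k = len(ratings[0])     # number of raters per target
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-target mean square.
    bms = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-target mean square.
    wms = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (bms - wms) / bms


# Hypothetical 7-point ratings: four teams, three raters each.
ratings = [
    [5, 6, 5],
    [3, 4, 3],
    [6, 6, 7],
    [4, 5, 4],
]

icc = icc_1k(ratings)
print(round(icc, 3))
```

Values near one arise when raters agree within a team while teams differ from each other, which is the pattern the reported coefficients are meant to capture.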
Construct Validity

Construct validity was measured in several ways. First, as explained, extensive field observations, interviews and pilot tests within the divisions of the organization were undertaken to ensure external validity. These observations served as a basis for determination of hypothesized variable relationships of a project team process model of performance. Based on these observations, only proxy variable measures with high reported item reliability of the hypothesized model variables were selected from previous studies.

Second, the sample size (n = 56) was too small relative to the recommended minimum of 100–200 observations and the number of survey items (i.e. indicators) too large relative to the sample size to obtain reliable measures of fit using confirmatory factor analysis. In such cases, exploratory factor analysis and inspection of factor loadings to confirm the adequacy of items and scales used as indicators is preferable. The first principal component explained 43.52% of the variance. The remaining components explained less than 10% each of the variance, indicating construct uni-dimensionality. Squared row loadings represent the squared correlation of the factor with each survey item in that row. Factor loadings below 0.50 should be treated with caution. The smallest loading in this sample was 0.601, indicating that the variable construct items fairly represented the constructs.

A third test of construct validity measured the internal reliability of each construct in the sample. A varimax factor rotation confirmed the factors represented by the survey items, yielding five distinct variable constructs consistent with the survey variable constructs. Only a single eigenvalue greater than one loaded on each factor. Eigenvalues ranged from 2.49 to 7.91 across the five variable constructs and are reported in Table 2.
Alpha coefficients ranged from 0.79 to 0.93, comfortably above minimum requirements (Cronbach, 1951). The overall performance metric scaled by respondents correlated at 0.95 (p < 0.00) with the summative measure of performance. Fourth, several checks of goal construct validity were performed. The factor loadings for the independent and competitive goal survey items were uniformly high with negative coefficient signs while the factor loadings for the cooperative goal survey items were uniformly high with positive coefficient signs. A confirmatory factor analysis was run on each goal construct separately and on all the goal items together (AMOS/SPSS). Goodness of fit indices were
significantly higher for individual constructs than all constructs together. The alpha coefficients and eigenvalues of each construct as reported in Table 2 reflect their unique components. These results confirmed the uni-dimensionality of the goal variable constructs.
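The alpha coefficients referred to in this section follow Cronbach (1951). A minimal sketch of the computation is given below, using the usual form alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed scale). The item scores are hypothetical and serve only to show the mechanics.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` holds one list of scores per survey item, with respondents
    in the same order in every list.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # summed scale per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 7-point responses from five respondents to three items.
items = [
    [5, 6, 4, 7, 3],
    [5, 7, 4, 6, 3],
    [6, 6, 5, 7, 2],
]

alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as evidence that the items of a construct measure a common dimension.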
Partial Least Squares Analysis

The covariance-correlation matrix of survey items is included in Table 3. The results confirmed all of the directional hypotheses specified by the model in Fig. 1. Asymmetric budget knowledge between team members correlated positively with independent and competitive goals and negatively with cooperative goals. Organizational controls correlated positively with cooperative goals and negatively with independent and competitive goals. Cooperative goals correlated positively while competitive and independent goals correlated negatively with information processing. Information processing correlated positively with performance.

Table 3. Sample Covariance Matrix and Correlations in Parentheses.

| | Team Budget Knowledge | Organization Controls | Independent Goals | Cooperative Goals | Competitive Goals | Information Processing |
|---|---|---|---|---|---|---|
| Organization controls | −0.050 (−0.084) | | | | | |
| Independent goals | 0.228 (0.339)** | −0.254 (−0.281)* | | | | |
| Cooperative goals | −0.144 (−0.244)* | 0.274 (0.346)** | −0.486 (−0.546)** | | | |
| Competitive goals | 0.210 (0.356)* | −0.151 (−0.191) | 0.619 (0.697)** | −0.282 (−0.362)** | | |
| Information processing | −0.051 (−0.080) | 0.091 (0.113) | −0.583 (−0.612)** | 0.632 (0.751)** | −0.311 (−0.384)** | |
| Performance | −0.046 (−0.091) | 0.122 (0.152) | −0.296 (−0.401)** | 0.388 (0.614)** | −0.132 (−0.202) | 0.354 (0.505)** |

* Significant at p < 0.05 (1-tailed). ** Significant at p < 0.01 (1-tailed).

A Partial Least Squares (PLS) model using UNIPALS (Glen et al., 1989) and SAS-STAT 12.0 (1999) was derived to analyze the hypothesized bivariate construct relationships posited by the model in Fig. 1. The PLS procedure fits predictive
partial least squares models with one set of predictors and one set of responses. Unlike factor analysis or regression, PLS explains both response and predictor variation. Predictor variation in our case is due to reflective indicators, quantified aspects of previously tested theory. Reflective indicators are invoked in an attempt to account for observed variances and covariances. Principal component-like factors (i.e. latent variables) of these indicators are extracted to explain predictor variation. Reduced rank regression is used to extract factors to explain response variation. The structural equations or inner relations are defined as:

$$E(\eta \mid \eta, \xi) = B\eta + \Gamma\xi \tag{1}$$

$$\eta = \pi_{\eta}\, y \tag{2}$$

$$\xi = \pi_{\xi}\, x \tag{3}$$

where $\eta$ and $\xi$ are vectors of unobserved criterion and explanatory variables respectively, $B$ is a matrix of coefficient parameters for $\eta$ and $\Gamma$ is a matrix of coefficient parameters for $\xi$. The unobserved variables $\eta$ and $\xi$ are linear combinations of their empirical indicators $y$ and $x$ respectively, with weights $\pi_{\eta}$ and $\pi_{\xi}$. PLS minimizes the residual variance of the latent variable structural equations and the residuals of the measurement model.
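The two ingredients of Eqs. (1)–(3), a latent variable formed as a weighted combination of its indicators and an inner relation linking latent score to response, can be sketched with a single-component PLS regression for one response. This is a simplified illustration and not the UNIPALS or SAS procedure used in the study; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardize(column):
    """Mean-center a column and scale it to unit (sample) variance."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s for v in column]


def pls_one_component(indicator_columns, response):
    """Extract one latent score from the indicators and regress the response on it."""
    X = [standardize(c) for c in indicator_columns]
    y = standardize(response)
    n = len(y)
    # Outer weights: proportional to each indicator's cross-product with y.
    raw = [sum(x_i * y_i for x_i, y_i in zip(col, y)) for col in X]
    norm = sqrt(sum(r * r for r in raw))
    weights = [r / norm for r in raw]
    # Latent score: weighted linear combination of the standardized indicators.
    scores = [sum(w * col[i] for w, col in zip(weights, X)) for i in range(n)]
    # Inner relation: least-squares slope of the response on the latent score.
    slope = sum(t * y_i for t, y_i in zip(scores, y)) / sum(t * t for t in scores)
    return weights, scores, slope


# Hypothetical team-level data (not from the study): two survey items, one outcome.
item_1 = [1, 2, 3, 4, 5]
item_2 = [2, 1, 4, 3, 5]
outcome = [1, 3, 2, 5, 4]

weights, scores, slope = pls_one_component([item_1, item_2], outcome)
print([round(w, 3) for w in weights], round(slope, 3))
```

Because all variables are standardized first, the weights and the slope are directly comparable in size, which mirrors the standard-form reporting of the PLS estimates.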
Hypothesis Tests

The PLS estimates are included in Table 4 and Fig. 2. PLS extracts successive linear combinations of the predictors or latent constructs that optimally explain response and predictor variation. The outer relation loadings in Table 4 are equivalent to correlation coefficients between the particular indicator (the survey item) and the latent variable (the construct). All of the coefficients are expressed in standard form (mean centered and scaled to unit variance) so as to compare their relative strengths. As can be seen, most of the questionnaire items loaded highly on their respective constructs. The low negative coefficients of Q4 of the organization control variable represent the division head's lack of influence on higher organization controls as discussed above. The negative coefficients of competitive goals → information processing and independent goals → information processing represented the negative effect of competitive and independent goals respectively on team information processing. All of the bivariate regression equations and their slopes, as defined in Eq. (1), were significant (p < 0.05) and in the posited direction, with adjusted R² values ranging from 0.07 to 0.82.

Table 4. Partial Least Square Questionnaire Item Loadings of the Independent Upon the Dependent Variables Represented in Fig. 1 (n = 56) in Descending Order from Table 1.

| Path | % Variance | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Adjusted R² | F-Value | Slope t-Value |
|---|---|---|---|---|---|---|---|---|---|---|
| Organization controls → competitive goals | 62.16 | 0.57 | 0.66 | 0.46 | −0.17 | – | – | 0.09 | 6.51** | 2.55** |
| Organization controls → cooperative goals | 61.23 | 0.69 | 0.50 | 0.47 | −0.22 | – | – | 0.18 | 13.27** | 3.64** |
| Organization controls → independent goals | 59.97 | 0.71 | 0.45 | 0.50 | −0.21 | – | – | 0.07 | 5.18* | 2.28* |
| Budget knowledge → competitive goals | 47.53 | 0.28 | 0.41 | 0.41 | 0.26 | 0.50 | 0.51 | 0.21 | 15.67** | 3.96** |
| Budget knowledge → cooperative goals | 47.11 | 0.28 | 0.36 | 0.34 | 0.24 | 0.63 | 0.47 | 0.13 | 8.87** | 2.98** |
| Budget knowledge → independent goals | 45.17 | 0.17 | 0.01 | 0.28 | 0.18 | 0.80 | 0.47 | 0.15 | 10.89** | 3.30** |
| Competitive goals → information processing | 48.47 | 0.01 | −0.63 | −0.47 | −0.61 | – | – | 0.44 | 43.65** | 6.61** |
| Cooperative goals → information processing | 77.75 | 0.47 | 0.55 | 0.48 | 0.49 | – | – | 0.82 | 248.27** | 15.76** |
| Independent goals → information processing | 67.92 | −0.46 | −0.37 | −0.60 | −0.53 | – | – | 0.46 | 47.35** | 6.88** |
| Information processing → performance | 55.70 | 0.55 | 0.46 | 0.51 | 0.09 | 0.47 | – | 0.27 | 21.84** | 4.67** |

Note: None of the remaining latent variables were significant at p < 0.05. The loadings in Table 4 are equivalent to correlation coefficients between the particular indicator (the survey item) and the latent variable (the construct). All of the coefficients are expressed in standard form so as to compare their relative strengths. * Significant at p < 0.05. ** Significant at p < 0.01.

Fig. 2. Partial Least Squares Regression Coefficients.

H1 posited that budget knowledge asymmetry between
team members increased independent budget goals (H1a) and competitive budget goals (H1c) and decreased cooperative budget goals (H1b). This hypothesis was confirmed. Knowledge asymmetry significantly increased independent goals (t = 3.30; p < 0.01) and competitive goals (t = 3.96; p < 0.01) and decreased cooperative goals (t = 2.98; p < 0.01). As indicated in Fig. 2 and Table 4, the adjusted R² of the bivariate regression was 0.15 between budget knowledge and independent goals (F = 10.89; p < 0.01), 0.21 between budget knowledge and competitive goals (F = 15.67; p < 0.01) and 0.13 between budget knowledge and cooperative goals (F = 8.87; p < 0.01).

H2 posited that organization controls would discourage independent (H2a) and competitive budget goals (H2c) and encourage cooperative budget goals (H2b). This hypothesis was corroborated for independent (t = 2.28; p < 0.05), competitive (t = 2.55; p < 0.01) and cooperative goals (t = 3.64; p < 0.01). As indicated in Fig. 2 and Table 4, the adjusted R² was 0.07 from organization controls to independent goals (F = 5.18; p < 0.05), 0.18 from organization controls to cooperative goals (F = 13.27; p < 0.01) and 0.09 from organization controls to competitive goals (F = 6.51; p < 0.01).

H3 posited that independent (H3a) and competitive budget goals (H3c) would hinder budget team information processing and that cooperative budget goals (H3b) would improve budget information processing. This hypothesis was confirmed for independent budget goals (t = 6.88; p < 0.01), cooperative budget goals (t = 15.76; p < 0.01) and competitive budget goals (t = 6.61; p < 0.01). As indicated in Fig. 2 and Table 4, the adjusted R² values were 0.46 (F = 47.35; p < 0.01) for independent budget goals, 0.82 (F = 248.27; p < 0.01) for cooperative budget goals and 0.44 (F = 43.65; p < 0.01) for competitive budget goals.

H4 was also corroborated. Information processing strongly affected budget performance (t = 4.67; p < 0.01), with an adjusted R² of 0.27 (F = 21.84; p < 0.01).
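The reported t, F and R² values can be cross-checked against one another. The sketch below assumes that each path is a simple regression with one predictor estimated on the n = 56 teams, so that F = t² with 54 residual degrees of freedom, R² = F / (F + 54) and adjusted R² = 1 − (1 − R²) × 55/54. These degrees of freedom are an assumption and are not stated in the text.

```python
def adjusted_r2_from_f(f_value, n, predictors=1):
    """Recover adjusted R^2 from the overall F statistic of an OLS regression."""
    df_resid = n - predictors - 1
    r2 = predictors * f_value / (predictors * f_value + df_resid)
    return 1 - (1 - r2) * (n - 1) / df_resid


def f_matches_t(t_value, f_value, half_unit=0.005):
    """With one predictor F = t^2; allow for t having been rounded to two decimals."""
    low = (abs(t_value) - half_unit) ** 2
    high = (abs(t_value) + half_unit) ** 2
    return low <= f_value <= high


N_TEAMS = 56  # team-level sample size reported in the text

# (slope t, F, adjusted R^2) as reported for four of the ten paths.
reported = {
    "organization controls -> competitive goals": (2.55, 6.51, 0.09),
    "organization controls -> cooperative goals": (3.64, 13.27, 0.18),
    "budget knowledge -> independent goals": (3.30, 10.89, 0.15),
    "cooperative goals -> information processing": (15.76, 248.27, 0.82),
}

checks = {}
for path, (t, f, adj_r2) in reported.items():
    implied = adjusted_r2_from_f(f, N_TEAMS)
    checks[path] = (f_matches_t(t, f), abs(implied - adj_r2) < 0.005)
    print(path, round(implied, 3), checks[path])
```

Under these assumptions the four paths shown are mutually consistent to rounding, which supports reading the R² figures quoted with each hypothesis as the adjusted values.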
DISCUSSION

In a business environment characterized by rapid technological change, effective capital budget teams are designed to respond to competitive challenges in a timely fashion. Budget teams can provide organizations with a sustainable competitive advantage but team-organization linkages are reciprocal and complex. Results of this study underscore the fact that structuring effective cross-functional budget teams is not a simple task. Managers with diverse expertise and knowledge did not always cooperate. Asymmetrically distributed knowledge between budget team members may foster competitive and independent budget goals. The results of this
budget team study suggest that prior findings of ineffective knowledge integration by budget teams (Chalos & Poon, 2000) and low transmittal of specialized knowledge in teams (Wittenbaum & Stasser, 1996) may be due in part to private knowledge not shared by team members. Knowledge transmitted between team members appears to lead to more cooperative goal setting that in turn leads to better information analysis. The results corroborate recent speculation (Alper et al., 1998) that knowledge asymmetry between team members is linked to competitive and independent goals. Without an incentive to develop cooperative budget goals, project teams may not be as effective as anticipated. When budget team members have little new knowledge to contribute, it may be more advisable to work independently. Conversely, when cross-functional team members have high domain-specific knowledge, mechanisms within the capital budget team must be instituted to encourage the distribution of this information.

Findings from this study corroborate budget inefficiencies found in self-managed budget teams operating under flat organizational structures (Cohen & Bailey, 1997; Levi & Slem, 1995). Results indicate that hierarchical controls may discourage independent and competitive budgetary team member goals. These results are consistent with theorizing that managerial control, when implemented in a progressive fashion, may encourage cooperation and coordination of activities (Mookherjee & Reichelstein, 1997; Tirole, 1986; Villadsen, 1995). Highly autonomous cross-functional teams appear to need direction and support from upper level management. This is analogous to overly decentralized and autonomous divisional units of organizations that are too loosely controlled. Organizational controls help to identify the context that supports super-ordinate budget goals. Greater task clarity and delineation of responsibilities may motivate team members to work cooperatively and less independently.
When knowledge asymmetries exist between team members, controls encourage cooperative goal formulation. This study suggests that hierarchical control may at times be an important antecedent to cooperative budget goal behavior. The results further suggest that cooperative budget goals improve team information processing. There was a significant decrement in information sharing under independent and competitive budget goals. The positive relationship between budget team member information processing and performance is consistent with prior findings in this area (Banker et al., 1996; Laughlin & Shupe, 1996; Stasser et al., 1995). The results of this study suggest that cooperative goals foster better information exchange and improved budget performance. Results of this study counter the common assumption that vertical budget controls inevitably undermine teamwork. The relationship between organizational controls and teamwork is apt to be more complicated than this study’s
results suggest. Previous theorizing suggests that controls can at times frustrate coordination. Managerial researchers have sometimes argued that control losses occur in hierarchies (Gilbert & Riordan, 1995; McAfee & McMillan, 1995) and that hierarchies lead to commitment to prior actions, ignorance of environmental information and information distortion (Sutcliffe, 1994). Budgetary forecast errors for example may become magnified when aggregated across successive hierarchical levels. Firms such as IBM, GM and Philips have learned that rigid hierarchy stifles innovation and creativity. Indeed, there may be a curvilinear relationship where too little control results in the fragmentation of independent and competitive goals but so do strong controls, especially when imposed in an autocratic manner.
Limitations

Since firm-specific budget data are proprietary and kept confidential from competitors, budget research is customarily performed via attitudinal surveys. Survey data have limitations that include possible response bias, measurement reliability and internal and external validity issues. In this study several precautions were taken to lessen these problems. External construct validity was sought through repeated field observations, interviews and pilot tests within the divisions of the organization. All teams were “in situ” and had worked together for a minimum period of six months. Constructs were drawn from the literature and statistical tests were performed to ensure internal reliability and the uni-dimensionality of each construct. Test instruments were administered to different respondent groups to maintain the independence of the variable observations and to avoid possible common method bias. To lessen recall bias, each set of respondents filled out an evaluation at the conclusion of a project and team inter-rater reliability was measured by an intra-class correlation coefficient. While longitudinal studies of teams may yield different results, the majority of capital budget teams are transitory in nature due to managerial turnover.

Nor does our causal model of team inputs and outputs rule out alternative causal relationships. For example, it may be argued that cooperative goals induce feelings of knowledge symmetry and the acceptance and appreciation of controls by organizational superiors. Indeed, team members may use their information sharing as a reason for believing their goals are cooperative. Although we believe that previous theorizing and research, as well as the data, support our directional model, simultaneous and reciprocal models of budget team processes cannot be ruled out. The findings were derived from one organizational setting. While this limits any generalization of the conclusions to other firms, there is no a priori reason to
suspect that team processes are firm specific. As the measures are self-reported, they may be biased (Spector, 1992). Yet recent evidence indicates that people often accurately perceive their social environment (Balzer & Sulsky, 1992; Murphy et al., 1992). Further, dependent and independent variables were measured between, not within, respondent groups.
Future Research

Effective budget teams help organizations channel and coordinate their capital resources. This study’s findings provide guidance for developing effective teams but budget teamwork remains a socially complex process. Applying this knowledge to develop effective budget teams remains a challenge. Budget team members cannot be expected to cooperate fully merely because they have been placed in a team structure. Appropriate conditions must be created. The development of sound personnel controls and information transmission should alleviate, if not eliminate, dysfunctional competitive budget goals. Cooperative budget team goals coupled with sound information analysis may be the basis upon which teams can fulfill their potential for intellectual capital development, innovation, and strategic advantage.

In addition to replications with different operations and samples, future research is needed to address several unresolved issues. The question of exactly what makes teams effective should be directly addressed by research on budget information; types of managerial control relative to cooperative and independent goals; team boundaries; and alternative incentive structures. Many of these variables are fundamental to effective organizational teamwork. As this study suggests, information is often unevenly distributed among team members. The relationship between budget controls, team management and incentives remains poorly understood, and while cooperative team goal formulation is critical to budget information sharing, the development of cooperative goals in project teams remains elusive. Knowledge asymmetry and incentive structure may lead to independent rather than cooperative goals. Research is needed to identify how cooperative goals can be developed under conditions of knowledge asymmetry.
In addition to investigating the relationship between budget controls and teamwork, future research is also needed to specify the particular ways that social controls can be useful. Spector and Brannick (1995) have argued that the most effective way to overcome recall and other methodological weaknesses is to test ideas with different methods. It would be desirable to provide direct experimental verification of the role of interdependence and interaction on team effectiveness in budget settings. The effects of intervention at one level, individual, team or organizational, may reside at another level. Multiple simultaneous influences on and by teams
occur. The prevalence of capital budget teams makes this a fertile ground for future research.
ACKNOWLEDGMENTS

The constructive comments of workshop participants at the American Accounting Association Management Accounting Conference and workshop participants at the University of Illinois and City University of Hong Kong are gratefully acknowledged. The research assistance of Eva Dunn and John Bowes is also acknowledged.
REFERENCES

Alper, S., Tjosvold, D., & Law, K. S. (1998). Interdependence and controversy in group decision making: Antecedents to effective self-managing teams. Organizational Behavior and Human Decision Processes, 74(1), 33–52. Anthony, R. N., & Govindarajan, V. (1998). Management control systems (9th ed.). NJ: Irwin/McGraw-Hill. Balzer, W. K., & Sulsky, L. M. (1992). Halo and performance appraisal research: A critical examination. Journal of Applied Psychology, 77, 975–985. Banker, R. D., Field, J. M., Schroeder, R. G., & Sinha, K. K. (1996). Impact of work teams on manufacturing performance: A longitudinal field study. Academy of Management Journal, 39(4), 867–890. Barney, J. B. (2001). Is the resource-based “view” a useful perspective for strategic management research? Yes. Academy of Management Review, 26, 41–56. Bettenhausen, K. L. (1991). Five years of group research: What we have learned and what needs to be addressed. Journal of Management, 17(2), 345–381. Campion, M. A., Medsker, G. J., & Higgs, A. C. (1993). Relations between group characteristics and effectiveness: Implications for designing effective work groups. Personnel Psychology, 46, 823–850. Carver, C. S., & Scheier, M. F. (1981). Attention and self-regulation: A control theory approach to human behavior. New York: Springer-Verlag. Chalos, P., & Pickard, S. (1985). Information choice and cue use: An experiment in group information processing. Journal of Applied Psychology, 70(4), 634–641. Chalos, P., & Poon, M. (2000). Capital budgeting performance in project teams. Behavioral Research in Accounting, 16, 123–152. Chalos, P., & Poon, M. (2001). Participative budgeting and performance: A review and re-analysis of prior research. Advances in Management Accounting, 10, 57–79. Cohen, S. G., & Bailey, D. E. (1997). What makes teams work: Group effectiveness research from the shop floor to the executive suite. Journal of Management, 23(3), 239–290. Cronbach, L. J. (1951). Coefficient alpha and internal structure of tests. Psychometrika, 16(3), 297–334. Drake, A. R., Haka, S. F., & Ravenscroft, S. P. (1999). Cost system and incentive structure effects on innovation, efficiency and profitability in teams. The Accounting Review, 74(3), 323–345.
Dunk, A. S. (1993). The effect of budget emphasis and information asymmetry on the relation between budgetary participation and slack. The Accounting Review, 68(2), 400–410. Dunk, A. S. (1995). The differential effect of information asymmetry on the relation between budgetary participation and departmental performance. Advances in Management Accounting, 4, 147–161. Eisenhardt, K. M., & Tabrizi, B. N. (1995). Accelerating adaptive processes: Product innovation in the global computer industry. Administrative Science Quarterly, 4, 84–110. Galbraith, J. R. (1994). Competing with flexible lateral organizations (2nd ed.). Reading, MA: Addison-Wesley. Gilbert, R., & Riordan, M. (1995). Regulating complementary products: A comparative analysis. Rand Journal of Economics, 26(2), 243–256. Giroux, G. A., Mayper, A. G., & Daft, R. L. (1986). Organization size, budget cycle and budget related influences in city governments: An empirical study. Accounting, Organizations and Society, 11(6), 499–519. Glen, W. G., Dunn, W. J., III, Sarker, M., & Scott, D. R. (1989). UNIPALS: Software for principal components analysis and partial least squares regression. Tetrahedron Computer Methodology, 2, 377–396. Guzzo, R. A., & Dickson, M. W. (1996). Teams in organizations: Recent research on performance and effectiveness. Annual Review of Psychology, 47, 307–338. Hayes, L. (1976). The use of group contingencies for behavioral control: A review. Psychological Bulletin, 83, 628–648. Henderson, J. C., & Lee, S. (1992). Managing I/S teams: A control theory perspective. Management Science, 38(6), 757–777. Hunton, J. (2001). Mitigating the common information sampling bias inherent in small group discussion. Behavioral Research in Accounting, 13, 121–137. Ickes, W., & Gonzalez, R. (1994). Social cognition: From the subjective to the inter-subjective. Small Group Research, 25, 294–315. Johnson, D. W., & Johnson, R. T. (1989). Cooperation and competition: Theory and research. 
Edina, MN: Interaction Book Company. Katzenbach, J. R., & Smith, D. K. (1993). The wisdom of teams. Cambridge, MA: Harvard Business School Press. Laughlin, P. R., & Hollingshead, A. B. (1995). A theory of collective induction. Organizational Behavior and Human Decision Processes, 61, 94–107. Lawler, E. E., III, Morhman, S. A., & Ledford, G. E., Jr. (1995). Creating high performance organizations: Practices and results of employee involvement and total quality management in Fortune 1000 companies. San Francisco: Jossey-Bass. Levi, D., & Slem, C. (1995). Team work in research – and – development organizations: The characteristics of successful teams. International Journal of Industrial Ergonomics, 16(1), 29–42. Mackie, D. M., & Goethals, G. R. (1987). Individual and group goals. Review of Personality and Social Psychology, 8, 144–167. Mahoney, T. A., Jerdee, T. H., & Caroll, S. J. (1965). The jobs of management. Industrial Relations, 4(2), 97–110. Malone, T. W. (1987). Modeling coordination in organizations and markets. Management Science, 33(10), 1317–1332. McAfee, P., & McMillan, J. (1995). Organizational diseconomies of scale. Journal of Economics and Management Strategy, 399–426.
Mitchell, T. R., & Silver, W. R. (1990). Individual and group goals when workers are interdependent: Effects on task strategies and performance. Journal of Applied Psychology, 58, 185–193. Mookherjee, D., & Reichelstein, S. (1997). Budgeting and hierarchical control. Journal of Accounting Research, 35(2), 129–155. Murphy, K. R., Jako, R. A., & Anhalt, R. L. (1992). Nature and consequences of halo error: A critical analysis. Journal of Applied Psychology, 78, 218–229. Nahapiet, J., & Ghoshal, S. (1998). Social capital, intellectual capital, and organizational advantage. Academy of Management Review, 23, 242–266. Nonaka, I., & Takeuchi, H. (1995). The knowledge-creating company. New York, Oxford: Oxford University Press. Ouchi, W. (1980). Markets, bureaucracies, and clans. Administrative Science Quarterly, 25, 129–141. SAS/STAT Software (1999). PROC PLS. Release 6.1 of the SAS System. Cary, NC: SAS Institute. Scott, T. W., & Tiessen, P. (1999). Performance measurement and managerial teams. Accounting, Organizations and Society, 24(3), 107–125. Senge, P. M. (1990). The fifth discipline: The age and practice of the learning organization. London: Century Business. Shields, M., & Shields, D. (1998). A review of budget participation: Antecedents and consequences. Accounting, Organizations and Society, 4(15), 212–228. Shields, M., & Young, M. (1993). Antecedents and consequences of participative budgeting: Evidence on the effect of asymmetrical information. Journal of Management Accounting Research, (Fall), 265–280. Shrout, P. E., & Fleiss, J. L. (1979). Intra-class correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 377–386. Simon, H. (1981). The sciences of the artificial (2nd ed.). Cambridge, MA: MIT Press. Simons, R. (1995). Levers of control: How managers use innovative control systems to drive strategic renewal. Boston: Harvard Business School Press. Spector, P. E. (1992). 
A consideration of the validity and meaning of self-report measures of job conditions. In: C. L. Cooper & I. T. Robertson (Eds), International Review of Industrial and Organizational Psychology (pp. 123–151). Chichester: Wiley. Spector, P. E., & Brannick, M. T. (1995). The nature and effects of method variance in organizational research. In: C. L. Cooper & I. T. Robertson (Eds), International Review of Industrial and Organizational Psychology (pp. 249–274). Chichester: Wiley. Stasser, G., Stewart, D. D., & Wittenbaum, G. M. (1995). Expert roles and information exchange during discussion: The importance of knowing who knows what. Journal of Experimental Social Psychology, 31, 244–265. Stasser, G., & Titus, W. (1985). Pooling of unshared information in group decision making: Biased information sampling during discussion. Journal of Personality and Social Psychology, 48, 1467–1478. Sutcliffe, K. M. (1994). What executives notice: Accurate perceptions in top management teams. Academy of Management Journal, 37(5), 1360–1378. Tirole, J. (1986). Hierarchies and bureaucracies: On the role of collusion in organizations. Journal of Law, Economics and Organizations, 2, 181–214. Tjosvold, D., & Tjosvold, M. M. (1994). Cooperation, competition, and constructive controversy: Knowledge to empower self-managing teams. In: M. M. Beyerlein & D. A. Johnson (Eds), Advances in Interdisciplinary Studies of Work Teams (Vol. 1, pp. 119–144). Greenwich, CT: JAI Press.
Villadsen, B. (1995). Communication and delegation in collusive agencies. Journal of Accounting and Economics, 19, 315–344. Winer, B. J. (1993). Statistical principles in experimental design. New York, NY: McGraw-Hill. Wittenbaum, G. M., & Stasser, G. (1996). Management of information in small groups. In: J. L. Nye & A. M. Brower (Eds), What’s Social About Social Cognition? Social Cognition Research in Small Groups (pp. 3–28). Thousand Oaks, CA: Sage. Young, S. M., & Selto, F. H. (1993). Explaining cross-sectional workgroup performance differences in a JIT facility: A critical appraisal of a field-based study. Journal of Management Accounting Research (Fall), 300–326.
PERFORMANCE EVALUATIONS, WITH OR WITHOUT DATA FROM A FORMAL ACCOUNTING REPORTING SYSTEM
Yin Xu and Brad Tuttle
ABSTRACT
The purpose of the study is to examine whether superiors (i.e. principals), who evaluate the performance of their subordinates (i.e. agents), take information asymmetry into account by assuming that subordinates shirk when the accounting system does not provide information on subordinates’ effort levels. A decision making experiment was conducted to examine the effect of information asymmetry on effort attribution and the effect of effort attribution on performance evaluation. The results show that the presence of an agency problem significantly affected managers’ beliefs regarding the level of effort they attributed to the subordinate, which affected their evaluation of the subordinate.
Advances in Accounting Behavioral Research, Volume 7, 153–168
Copyright © 2004 by Elsevier Ltd. All rights of reproduction in any form reserved
ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07007-3
INTRODUCTION
Performance evaluations are important in every organization because they serve as a basis for pay raises, promotions, demotions, and even for continued employment. In many instances, performance evaluation is made difficult because the evaluator cannot directly observe how much effort the subordinate exerts in his work. When this happens, the agent is said to have private information about
his actions, giving rise to information asymmetry because the subordinate knows how much effort he exerts but the evaluator does not. Under conditions of information asymmetry, agency theory suggests that agents can pursue their own self-interest at the expense of the principal (Beaver, 1998; Eisenhardt, 1989). A number of experimental studies confirm that when agents possess privately held information they often reach decisions that are in their own self-interest even when their decisions are contrary to the interests of their firm (e.g. Harrell & Harrison, 1994; Tuttle et al., 1997).
This being the case, the present study examines whether principals attribute lower levels of effort to agents when the principal lacks reliable information about the agent as compared to when the principal has better information. It further tests whether these attributions affect the principal’s opinion of the agent. Such evaluative judgments are important in a multiple period agency context because subsequent period contracts are likely to be influenced by the principal’s opinion about how hard a particular agent has worked in prior periods.
Attribution theory provides a model of how people form evaluative judgments in the absence of complete information. This model suggests that, given an observed outcome, individuals first make inferences as to what caused the outcome before forming opinions about the person who affected the outcome. Possible attributions to explain a given outcome include luck, task difficulty, ability, or effort level. These attributions mediate the relationship between outcomes and the evaluative judgment and determine the response or action that the superior takes for any given outcome (Green & Mitchell, 1979; Knowlton & Mitchell, 1980).
One important contribution of the present study is its integration of social and economic paradigms. Researchers have typically relied on economic theories to explain contracting behavior and social theories to explain opinion formation. The present study proposes that social theories may have a natural link to agency theory because information asymmetry influences the principal’s opinion of how hard the agent works and that this opinion will likely affect future contracts. Hence, an important contribution is the integration of attribution theory into an agency setting.
An experiment was conducted in which highly experienced business owners and general managers projected themselves into a hypothetical evaluation of a subordinate whose performance either met or did not meet their expectations. The subordinate was removed geographically from the principal and the situation was manipulated so that the owner either could (information symmetry condition) or could not (information asymmetry condition) determine how much effort was exerted by the subordinate. In the information asymmetry condition, participants attributed lower effort to the subordinate than in the information symmetry condition. Attributions of effort, in turn, mediated the effect of outcome on the owners’ performance evaluations.
The remainder of this paper is organized as follows. The next section provides a brief overview of the basic research and a development of hypotheses. Then the research method is described, followed by the analysis of the results. The final section discusses the limitations and implications of the study.
BACKGROUND AND HYPOTHESES
Agency theorists have focused on identifying situations in which the interests of the principal and agent are likely to conflict and on depicting the control mechanisms that limit the agent’s self-serving behavior. An agency relationship exists when an individual (principal) hires other individuals (agents) to act on the principal’s behalf (Baiman, 1982, 1990). Agency problems can arise in situations where it is difficult for the principal to verify what the agent is actually doing. In this case, the agent may shirk by pursuing his own interests, which may or may not be consistent with those of the principal. One control mechanism is to implement an accounting information system that provides information to the principal about the agent’s behavior (Fama, 1980).
According to agency theory, information such as accounting information plays a crucial role in addressing agency problems. Within an agency framework, one of two states of information is assumed to exist: (a) information symmetry; or (b) information asymmetry. Under the condition of information symmetry, the principal knows what efforts the agent has exerted. In this case, the agent does not have an opportunity to shirk because the principal will detect the agent’s behavior. Under the condition of information asymmetry, however, the principal lacks some or all information about the agent’s level of effort, so that the principal does not know exactly what the agent has done. In this case, shirking may go undetected by the principal. Lack of accounting information is frequently the source of this asymmetry. A number of studies provide support for the prediction that agents shirk when the principal lacks information about their efforts (e.g. Harrell & Harrison, 1994; Tuttle et al., 1997).
One way to solve such problems is to better monitor the efforts of the agent by investing in comprehensive accounting information systems that track performance and process information in addition to traditional financial data. But this kind of monitoring is often too expensive or impractical for the principal to implement, particularly in smaller organizations. When high information asymmetry makes it difficult for the principal to obtain good measures of the agent’s effort, the principal will have low confidence in his/her effort attribution. If compensation or a contract is based on such a noisy effort attribution, it will be very risky for the agent, and the agent will normally refuse ex ante to accept such a contract. For example, a shop foreman would refuse a
contract that ties a large portion of compensation to performance when outcomes are affected by supply shortages and breakdowns outside his/her control. In this case, the principal will seek to provide an alternative contract whose payoff is not so dependent on effort. Knowing that the contract payoff is not so dependent on effort attribution, the agent will slack off. The above arguments clearly suggest that information asymmetry leads to the possibility of the agent shirking.
Situations frequently arise in which principals or superiors lack complete information about the effort level of their agents or subordinates, yet the superior is placed in the position of evaluating the agent/subordinate’s performance. We argue that principals take the possibility of shirking into account when evaluating their subordinates. According to attribution theory, for any observed outcome, an individual’s level of performance on a task will be attributed either to factors within the person (internal factors) or to factors outside the person (external factors) (Heider, 1958). This is termed locus of causality (Tongtharadol et al., 1991; Weiner et al., 1972). Consistent outcomes associated with a particular person over time have been shown to be attributed to ability (internal) and to task difficulty (external), whereas inconsistent outcomes tend to be attributed to effort level (internal) and to luck (external). In early iterations of a multiple-period agency problem, sufficient outcomes have not yet accumulated for the principal to evaluate consistency. In this case, we assert that information asymmetry will influence the attribution towards effort.
In summary, the above arguments suggest that an agent is likely to shirk when the principal cannot completely monitor his effort. That is, a subordinate is expected to exert less effort when he knows his superior cannot find him out.
Since the superior knows this, it is reasonable to assume that the superior will attribute less effort to subordinates under conditions of information asymmetry than under information symmetry. These arguments lead to the first hypothesis.
H1. The principal will attribute lower levels of effort to an agent under conditions of information asymmetry than under conditions of information symmetry.
H1 not only has theoretical implications, but also is included as a validation of the experimental design. If H1 is supported, it will provide evidence that the subjects perceive that an agency problem exists when they cannot directly observe the agent’s effort. To see this, suppose that no significant difference results between the symmetric and asymmetric information cases. In this case one could conclude that no principal-agent problem exists, or that the principal thinks the agent exerts the same effort whether or not his effort exertion is observed, or that the subjects in the experiment had misunderstood the instructions. Support for H1 confirms both that a principal-agent problem exists and that the subjects understand and react to its implications.
The likelihood of achieving positive outcomes typically is a positive function of the agent’s behavior. Thus, when the principal cannot directly observe the agent’s behavior, the principal is more likely to make effort attributions based on what he or she can observe (i.e. outcome). It follows that, holding information asymmetry constant, to the extent the outcome is more positive, the principal will still attribute a higher level of effort to the agent. H2 summarizes this expectation.
H2. The outcome of the agent’s performance will positively influence the principal’s attribution of the agent’s effort.
Agency theory assumes that the agent’s effort level influences the outcome and thus the value to the principal of employing the agent (Kreps, 1990). This suggests that at the time of contracting, the principal will take into consideration any beliefs about how much effort the particular agent is likely to exert. These beliefs should affect the principal’s choice of contracts having different pay to performance sensitivity and are likely formed by observing outcomes over time, i.e. over multiple contracting periods.
Research suggests that attributions about effort directly influence how superiors ultimately evaluate their subordinates (e.g. Green & Mitchell, 1979). This research indicates that internal attributions, i.e. to effort and ability, have a greater impact on evaluative judgments than external attributions (Weiner et al., 1972). Further, studies provide consistent evidence that supervisors tend to link rewards and punishments for performance more to effort than to ability because effort is seen as being under the subordinate’s immediate control (e.g. Knowlton & Mitchell, 1980; Weiner et al., 1972). This suggests that combinations of information asymmetry and outcomes that lead to attributions regarding effort will influence the principal’s evaluation of the agent’s performance:
H3. The principal’s attributions of effort will positively influence the principal’s evaluation of the agent.
No logic or evidence is available to our knowledge to support an unmediated and direct association from information symmetry to performance evaluation. There is intuition, however, to suggest that a direct association between outcome and performance evaluation may exist. When a principal lacks complete information about effort, it is likely that the principal will have lower confidence in his attribution for the agent’s performance than when the principal has better information. This lack of confidence may lead the principal to base the evaluation on the outcome information because it is perceived to be more reliable than the attribution. Therefore, while not specifically predicted by attribution theory, we test the possibility that a direct path from outcome to performance evaluation exists (see Note 1).
Fig. 1. Hypothesized Model of Information Asymmetry, Performance Outcome, and Perceived Effort in the Performance Evaluation Process.
Based on this discussion, we propose a model of performance evaluation that is depicted in Fig. 1. The path from information asymmetry to attribution suggests that information asymmetry is negatively associated with perceived effort as set forth in H1. The path from outcome to attribution (H2) indicates that outcome is positively associated with perceived effort. The path from attribution to performance evaluation (H3) suggests a positive relationship. Finally, the model indicates that outcome may have a direct effect on performance evaluation.
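The four paths described for Fig. 1 can be summarized, as a sketch, by two linear equations. The symbols below (the coefficients and error terms) are illustrative notation introduced here and do not appear in the original; the linear form is the usual assumption of path analysis rather than something the text states explicitly.

```latex
\begin{align}
\text{Effort} &= \beta_{1}\,\text{Asymmetry} + \beta_{2}\,\text{Outcome} + \varepsilon_{1},
  & \beta_{1} &< 0 \ (\text{H1}), \quad \beta_{2} > 0 \ (\text{H2}) \\
\text{Evaluation} &= \beta_{3}\,\text{Effort} + \beta_{4}\,\text{Outcome} + \varepsilon_{2},
  & \beta_{3} &> 0 \ (\text{H3}), \quad \beta_{4} \text{ the direct path}
\end{align}
```

Written this way, the model contains no direct term from information asymmetry to evaluation: any effect of asymmetry on the evaluation is assumed to run through perceived effort.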
METHOD
Research Design
An experiment was conducted using a between-subjects design to test the hypotheses stated above. Information (symmetric versus asymmetric) and outcome (met versus did not meet expectations) served as the independent variables. The dependent measures were: (1) the superior’s attributions of the subordinate’s effort level; and (2) the superior’s evaluation of the subordinate’s performance.
Participants
In order to locate a suitable sample of subjects that fit the constraints of the study (i.e. individuals who act in the role of a principal and who are familiar with substantial agency issues), a search of the Internet was made for trade associations that listed owners and general managers. A list of members of the National Association of Metal Finishers was obtained and members with titles of owner, president, vice president, general manager, and division manager were included in our selection. Using this database, a sample of 400 participants was obtained for which addresses were located using Internet-based business locators.
Decision cases were mailed with a cover letter to each addressee. Because performance evaluation can be a sensitive topic for managers, we emphasized a promise of anonymity in the cover letter. Due to institutional constraints and the relatively high response rate, a single mailing was used. Of the 400 surveys mailed, 15 were returned as undeliverable, leaving 385 surveys available to subjects. Of the 79 that were returned, 5 were incomplete, producing 74 (19.2%) usable responses. A χ² test does not reveal any significant difference in response frequencies among the four treatment conditions. Table 1 presents detailed demographic information on the subjects. The median age of the respondents is 50 years. On average they have 26 years of work experience and supervise an average of 50 individuals. Most individuals work in firms with fewer than 100 employees but approximately 30% have between 101 and 500. It appears that the sample clearly meets the experience and knowledge characteristics desired for the study.
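The usable response rate follows directly from the mailing counts given above. The short sketch below reproduces that arithmetic; the variable names are illustrative and are not taken from the study.

```python
# Counts reported in the text.
mailed = 400
undeliverable = 15
returned = 79
incomplete = 5

delivered = mailed - undeliverable   # surveys that actually reached subjects
usable = returned - incomplete       # complete responses kept for analysis
response_rate = usable / delivered   # usable responses per delivered survey

print(delivered, usable, f"{response_rate:.1%}")  # 385 74 19.2%
```

The denominator is the number of delivered surveys (385), not the number mailed (400), which is why the rate is 19.2% rather than 18.5%.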
Table 1. Participant Demographics.

Panel A
Demographic Variable                        Number of Participants
Age
  30 or younger                                       1
  Between 30 and 50                                  39
  Between 50 and 70                                  28
  70 or older                                         6
Formal education
  0 to 8 years                                        0
  9 to 12                                             9
  13 to 16                                           41
  17 or over                                         24
Firm size
  0 to 100 employees                                 53
  101 to 500                                         20
  501 to 1,000                                        1
  1,000 or over                                       0

Panel B
                                                    Mean
Years of full time work experience                   26
Maximum number of individuals supervised             50
Decision Task
The participants were asked to project themselves into the role of the owner or general manager of a hypothetical company and asked to assess the level of effort and evaluate performance for a subordinate based on a short decision case. A sample copy of the instrument is shown in the Appendix. The experimental materials were specifically constructed to fit a situation familiar to the metal finishing industry from which the participants were obtained while corresponding to the theoretical constructs of interest. Participants were informed that the business strategy of their firm is to provide customers with high quality products and service and that as the owner or general manager they had specifically told all shop foremen that quality production and service is their primary goal. The decision case described a situation in which the participant (as the owner or the general manager of Midwest Metal) had recently purchased a second shop located in a nearby state. Initially, the case suggested that the participant had spent a considerable amount of time at the new shop, but it had been over a month since they were there. At this point in time, the participants were told that they were reviewing the results of last month’s activities for the new shop. Within this setting the independent variables related to outcome and information were manipulated.
Outcome Manipulation
Outcome was operationalized by varying whether or not the product and service quality (i.e. the foreman’s primary goal) at the new shop met the owner or manager’s expectation. Comprehensive accounting systems, such as Kaplan and Norton’s Balanced Scorecard, typically include information to help evaluate performance related to product and service quality whereas traditional financial accounting systems do not (Tuttle & Ullrich, 2003). Under the condition of positive outcome, participants were informed that product and service quality at the new shop met their expectations, coded as 1. Under the condition of negative outcome, participants were told that the quality at the new shop did not meet their expectations, coded as 0.
Information Manipulation
Information was manipulated at two levels: (1) information symmetry in which the owner or manager could determine the effort level of the foreman, coded as 0; and (2) information asymmetry in which the owner or manager could not determine the
foreman’s level of effort, coded as 1. Participants under the condition of information asymmetry were provided with the following information:
The shop foreman at the new shop knows how important high quality is to you but also knows that you will have no way of discovering how much effort he spent last month to achieve his quality goals.
Participants under the condition of information symmetry were given the following information:
The shop foreman at the new shop knows how important high quality is to you and also knows that you will eventually discover how much effort he spent last month to achieve his quality goals.
It is important to note that the participants were provided with a situation in which information was either symmetrical or asymmetrical but were not provided with information about the actual effort level of the subordinate. Thus the participants were set in a context in which the foreman believed his situation to be either symmetrical or asymmetrical with respect to information about his effort. Had we revealed to the participants how much effort the foreman had actually exerted, then it would be impossible to isolate the effects of information asymmetry from the effect of receiving this specific effort information on the performance evaluation judgment.
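The two manipulations, each at two levels, define the four treatment cells of the between-subjects design. The sketch below enumerates them using the numeric codes given in the text; the dictionary and field names are illustrative only.

```python
from itertools import product

# Codes as stated in the text:
# information symmetry = 0, information asymmetry = 1;
# quality did not meet expectations = 0, met expectations = 1.
INFORMATION = {0: "symmetry", 1: "asymmetry"}
OUTCOME = {0: "did not meet expectations", 1: "met expectations"}

# Each combination of the two factors is one treatment cell.
cells = [
    {
        "information": info_code,
        "outcome": outcome_code,
        "label": f"information {INFORMATION[info_code]}, "
                 f"quality {OUTCOME[outcome_code]}",
    }
    for info_code, outcome_code in product(INFORMATION, OUTCOME)
]

for cell in cells:
    print(cell["information"], cell["outcome"], cell["label"])
```

Each participant would fall into exactly one of these four cells, so the two codes can serve directly as dummy variables in the later analysis.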
Measured Variables
Two variables were measured in this study, both on an 11-point Likert scale: (1) perceived effort; and (2) performance evaluation. Subjects were asked to rate the shop foreman’s effort toward achieving the quality goal (1 = very low effort, 11 = very high effort). They were also asked to rate the foreman’s performance (1 = very poor performance, 11 = very good performance).
RESULTS
Preliminary Analysis
Initial tests were performed to assess the effectiveness of the randomization process and to check the manipulations. The demographic variables were examined to see if differences exist between experimental conditions based on age, education, work experience, the number of employees supervised, and firm size. Analyses of variance show no significant main effects or interactions of the two
independent variables on any of the demographic variables, suggesting that subject demographics do not differ significantly among the four treatment conditions. Also, nine participants responded incorrectly to one of the manipulation check questions. Separate analyses were performed with and without the data from these nine subjects. The results remain unchanged as to their statistical significance and interpretation. The analyses described below include the responses from all 74 participants.
The Model Fit Tests
A path analysis was conducted to assess the appropriateness of the overall theoretical model shown in Fig. 1. We used four criteria for assessing the model fit: (1) the ratio of χ² to the degrees of freedom (χ²/df); (2) the Bentler-Bonett non-normed fit index (NNFI); (3) the Bentler comparative fit index (CFI); and (4) the root-mean-square error of approximation (RMSEA). It is suggested that the χ²/df ratio and the RMSEA should not exceed 2.0 and 0.06, respectively, and that the NNFI and CFI should not fall below 0.95 (Hoetler, 1983; Hu & Bentler, 1998). Based on these benchmarks, the proposed model appears to fit the data well. As shown in Table 2, the RMSEA is 0.000 and the NNFI and CFI are 1.039 and 1.000, respectively. The chi-square to degrees of freedom ratio is 0.017 and also indicates an acceptable fit.
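The four benchmarks can be checked mechanically against the reported values. The following is a minimal sketch, using the cutoffs and fit statistics quoted above; the dictionary keys are illustrative names.

```python
# Fit statistics as reported in the text.
fit = {"chi2_df_ratio": 0.017, "rmsea": 0.000, "nnfi": 1.039, "cfi": 1.000}

# Benchmarks as cited in the text: upper bounds for the ratio and RMSEA,
# lower bounds for NNFI and CFI.
checks = {
    "chi2/df <= 2.0": fit["chi2_df_ratio"] <= 2.0,
    "RMSEA <= 0.06": fit["rmsea"] <= 0.06,
    "NNFI >= 0.95": fit["nnfi"] >= 0.95,
    "CFI >= 0.95": fit["cfi"] >= 0.95,
}

for name, passed in checks.items():
    print(f"{name}: {'met' if passed else 'not met'}")

acceptable_fit = all(checks.values())
print("all benchmarks met:", acceptable_fit)
```

Note that the NNFI is not bounded above by 1, so a value such as 1.039 is possible and simply satisfies the lower bound.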
Tests of Hypotheses
The path coefficients that specify the hypothesized causal relationships for the theoretical model are tested next. H1 predicts a negative relationship between information asymmetry and effort attribution. As shown in Fig. 2, the path coefficient from information asymmetry to perceived effort is −0.17. Consistent with the hypothesis, the coefficient for this relationship is negative and significant
Table 2. The Goodness of Fit Indices.

χ²        df      χ²/df     RMSEA     NNFI      CFI
0.017     1       0.017     0.000     1.039     1.000

Notes: N = 74; RMSEA = root mean square error of approximation; NNFI = non-normed-fit index; CFI = comparative fit index. ∗ p < 0.001.
Fig. 2. The Results of Standardized Path Coefficients of the Path Model. Note: ∗ p < 0.05 and ∗∗ p < 0.001.
(p < 0.05), suggesting that principals assign lower effort to agents under the condition of information asymmetry than under the condition of information symmetry. Therefore, H1 is supported. H2 predicts a positive relationship between outcome and effort attribution. As shown in Fig. 2, the path coefficient from outcome to effort attribution is +0.52. Consistent with the hypothesis, the coefficient for this relationship is positive and significant (p < 0.001), suggesting that effort attributions are influenced by outcome. Therefore, H2 is supported. H3 predicts a positive relationship between effort attribution and performance evaluation. As indicated in Fig. 2, the path coefficient from effort attribution to evaluation is +0.67. Consistent with the hypothesis, the coefficient for this relationship is positive and significant (p < 0.001), suggesting that higher effort attributions lead to higher performance evaluations. Thus, H3 is supported. We also test the direct path from outcome to evaluation as suggested by intuition. This path is positive (r = +0.37) and significant (p < 0.001) suggesting that attributions only partially mediate the effect of outcome on performance evaluations. As can be seen in Fig. 2, however, the standardized path coefficient for effort attributions to performance evaluation is substantially greater (0.67) than the standardized path coefficient from outcome to performance evaluation (0.37). This suggests that attributions about effort strongly mediate the relationship between outcome and performance evaluation – a finding that is entirely consistent with attribution theory.
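The extent of mediation can be illustrated with the standard product-of-coefficients rule from path analysis: the indirect effect along a chain of paths is the product of its standardized coefficients, and the total effect is the direct effect plus the indirect effect. This decomposition is not reported in the study; the sketch below simply applies the rule to the coefficients shown in Fig. 2, with illustrative variable names.

```python
# Standardized path coefficients reported in the text (Fig. 2).
asymmetry_to_effort = -0.17
outcome_to_effort = 0.52
effort_to_evaluation = 0.67
outcome_to_evaluation = 0.37  # direct path

# Indirect effects: product of the coefficients along each mediated chain.
indirect_outcome = outcome_to_effort * effort_to_evaluation
indirect_asymmetry = asymmetry_to_effort * effort_to_evaluation

# Total effect of outcome on evaluation = direct + indirect.
total_outcome = outcome_to_evaluation + indirect_outcome
share_mediated = indirect_outcome / total_outcome

print(f"indirect effect of outcome:   {indirect_outcome:.3f}")    # 0.348
print(f"total effect of outcome:      {total_outcome:.3f}")       # 0.718
print(f"share mediated by effort:     {share_mediated:.3f}")      # 0.485
print(f"indirect effect of asymmetry: {indirect_asymmetry:.3f}")  # -0.114
```

Under these assumptions, roughly half of the total effect of outcome on the evaluation runs through perceived effort, which is consistent with the description of partial mediation above.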
DISCUSSION
Before discussing the implications of the findings, it is important to review some of the limitations and strengths of the study. The data were collected using a short hypothetical case and with a sample of convenience. Therefore, care should be
exercised when extrapolating the findings to other situations and contexts. Given the research question, however, conducting the study in the field helped us to obtain very experienced subjects who themselves were owners and high level managers in companies of sufficient size and nature to experience agency issues like those we examine. Furthermore, we used a between-subjects design with random assignment to treatment, thus permitting us to draw causal inferences that are not always possible from small sample or archival field data.
This study developed a model of performance evaluation within the framework of both agency and attribution theory. The findings from this study support this model and demonstrate that information asymmetry and performance outcomes have significant influence on the superior’s attributions about the subordinate’s effort. Furthermore, these attributions strongly mediated the effect of outcome on the principal’s evaluation of the agent’s performance, which is consistent with previous attribution research (e.g. Green & Mitchell, 1979).
From an accounting point of view, the process of performance evaluation may be viewed as a control used to reinforce desirable behavior. As such, this study relies on the assumption that opinions about the work ethic of a subordinate are an important contracting variable in an agency setting. Hence, performance evaluations have the potential to influence the nature of subsequent contracts. Often, however, it is difficult to obtain good measures of effort and this is likely to influence the principal/superior’s effort attribution and confidence in their effort attribution. In this case, the agent may see performance evaluations and subsequent pay decisions based on such effort attributions as being unfair or too risky. To the extent that accounting information systems reduce information asymmetry, these problems may be avoided.
We point out, however, that the present study only examined one type of attribution, attributions about effort. Future researchers may want to examine other aspects of attributions and incorporate a multiple period setting. Future research may also want to investigate other types of agency problems than the one studied here. In essence, the information asymmetry problem addressed in the present study is the classic moral hazard problem where the agent chooses to exert an unobservable level of effort that will improve the outcome. Implicit in our study is that effort on the part of the agent results in improved performance. Of course, other conditions are possible. In the attribution literature, the link between effort and results is subsumed under attributions to task difficulty (i.e. an external attribution). Because the subjects were randomly assigned to treatment condition, individual tendencies to infer stronger or weaker links between effort and outcomes are controlled for along with other individual differences such as risk preferences, etc. Future studies may wish to examine the effects of individual differences on different kinds of agency problems.
Although we designed this study to integrate findings from previous research, our work is different from previous efforts in some ways. One aspect in which this study differs most from prior research is that we directly examined principals’ behavior rather than agents’ behavior given the condition of information asymmetry/symmetry. The results of this study provide evidence that principals do take information asymmetry into account when making causal attributions for positive and negative outcomes of subordinates. We believe that this constitutes an important contribution to the accounting literature.
The results have other implications for the literature regarding information systems and contracting. For example, there is some question about how the risk inherent in the information system affects contracting. Suppose that there is an agency problem, so that the manager does not observe effort. Instead the manager designs a pay for performance contract. Then a question one might ask is whether an inherently riskier outcome will be associated with a contract where payment is more sensitive to performance. The controversy is that while some have argued that there is a negative relation between risk and pay to performance sensitivity (see Aggarwal & Samwick, 1999; Garen, 1994), others have argued the opposite. In fact, there are questions whether empirical research on this point is likely to be fruitful at all (see Haubrich, 1994). Further, these relations have not been extensively analyzed in a formal manner, which, at least in part, explains the controversy (see Kim, 1995, for theoretical analysis on this point). The results of the present study only indirectly address this issue; future research could address this question more directly. One would need to be careful in the design, as the inherent risk or variance of the outcome would need to be distinguished from the covariance between effort and outcome.
But this would be a clear contribution to the literature that could be linked to other work currently being done. We hope that this study will stimulate other researchers to pursue these issues.
NOTE 1. We thank one of the reviewers for suggesting this possibility.
REFERENCES
Aggarwal, R. K., & Samwick, A. A. (1999). The other side of the trade-off: The impact of risk on executive compensation. Journal of Political Economy, 107(1), 65–105.
Baiman, S. (1982). Agency research in managerial accounting: A survey. Journal of Accounting Literature, 1, 154–213.
Baiman, S. (1990). Agency research in managerial accounting: A second look. Accounting, Organizations and Society, 15, 341–371.
Beaver, W. H. (1998). Financial reporting: An accounting revolution (3rd ed.). Prentice-Hall.
Eisenhardt, K. M. (1989). Agency theory: An assessment and review. Academy of Management Review, 14, 57–74.
Fama, E. (1980). Agency problems and the theory of the firm. Journal of Political Economy, 88, 288–307.
Garen, J. E. (1994). Executive compensation and principal-agent theory. Journal of Political Economy, 102(6), 1175–1199.
Green, S., & Mitchell, T. R. (1979). Attributional processes of leaders in leader-member interactions. Organizational Behavior and Human Performance, 23, 429–458.
Harrell, A., & Harrison, P. (1994). An incentive to shirk, privately held information, and managers’ project evaluation decisions. Accounting, Organizations and Society, 19, 569–577.
Haubrich, J. G. (1994). Risk aversion, performance pay, and the principal-agent problem. Journal of Political Economy, 102(2), 258–276.
Heider, F. (1958). The psychology of interpersonal relations. New York: Wiley.
Hoetler, J. W. (1983). The analysis of covariance structures: Goodness of fit indices. Sociological Methods and Research, 11, 325–344.
Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized. Psychological Methods, 3, 424–453.
Kim, S. K. (1995). Efficiency of an information system in an agency model. Econometrica, 63, 89–102.
Knowlton, W. A., & Mitchell, T. R. (1980). Effects of causal attributions on a supervisor’s evaluation of subordinate performance. Journal of Applied Psychology, 65, 459–466.
Kreps, D. M. (1990). A course in microeconomic theory. Princeton, NJ: Princeton University Press.
Tongtharadol, V., Reneau, J. H., & West, S. G. (1991). Factors influencing supervisor’s responses to subordinate’s poor performance: An attributional analysis. Journal of Management Accounting Research, 3, 194–212.
Tuttle, B., Harrell, A., & Harrison, P. (1997). Moral hazard, ethical considerations, and the decision to implement an information system. Journal of Management Information Systems, 13(4), 7–27.
Tuttle, B., & Ullrich, M. J. (2003). The effects of incentive structure and goal challenge on time planning decision within a Balanced Scorecard framework. Advances in Accounting Behavioral Research, 6, 121–144.
Weiner, B., Frieze, I., Kukla, A., Reed, L., Rest, S., & Rosenbaum, R. (1972). Perceiving the causes of success and failure. In: E. Jones, D. Kanouse, H. Kelley, R. Nisbett, S. Valins & B. Weiner (Eds), Attribution: Perceiving the Causes of Behavior. Morristown, NJ: General Learning Press.
APPENDIX: SAMPLE RESEARCH INSTRUMENT

Information Symmetry and Positive Outcome Condition

Background
Assume that you are the owner or the general manager of Midwest Metal Inc. Your business strategy requires that you provide customers with high quality products and service. It is often difficult to maintain high quality all the time. Sometimes the products and services in the shop have not met your standards of high quality. Therefore, you tell all your shop foremen that their primary goal is to produce high quality products and service. Although you have stressed this goal to your shop foremen, you understand that sometimes, despite their best efforts, quality problems may still arise.

Performance Evaluation
Assume that you recently purchased a second shop located in a nearby state. Initially, you spent a considerable amount of time at the new shop, but it has been over a month since you were there. The shop foreman at the new shop knows how important high quality is to you and also knows that you will eventually discover how much effort he spent last month to achieve his quality goals.

Outcome
You are reviewing the results of last month’s activities for the new shop. You find that last month, product and service quality at the new shop met your expectations.

Decision
At this point, you are evaluating the foreman at the new shop.

(a) Using the following scale, how would you rate the foreman’s effort toward achieving his quality goal (circle one)?
Very Low Effort 1 . . . 2 . . . 3 . . . 4 . . . 5 . . . 6 . . . 7 . . . 8 . . . 9 . . . 10 . . . 11 Very High Effort

(b) Using the following scale, how would you rate the foreman’s job performance (circle one)?
Very Poor Performance 1 . . . 2 . . . 3 . . . 4 . . . 5 . . . 6 . . . 7 . . . 8 . . . 9 . . . 10 . . . 11 Very Good Performance
Information Asymmetry and Negative Outcome Manipulations

Background
Assume that you are the owner or the general manager of Midwest Metal Inc. Your business strategy requires that you provide customers with high quality products and service. It is often difficult to maintain high quality all the time. Sometimes the products and services in the shop have not met your standards of high quality. Therefore, you tell all your shop foremen that their primary goal is to produce high quality products and service. Although you have stressed this goal to your shop foremen, you understand that sometimes, despite their best efforts, quality problems may still arise.

Performance Evaluation
Assume that you recently purchased a second shop located in a nearby state. Initially, you spent a considerable amount of time at the new shop, but it has been over a month since you were there. The shop foreman at the new shop knows how important high quality is to you but also knows that you will have no way of discovering how much effort he spent last month to achieve his quality goals.

Outcome
You are reviewing the results of last month’s activities for the new shop. You find that last month, product and service quality at the new shop did not meet your expectations.

Decision
At this point, you are evaluating the foreman at the new shop.

(a) Using the following scale, how would you rate the foreman’s effort toward achieving his quality goal (circle one)?
Very Low Effort 1 . . . 2 . . . 3 . . . 4 . . . 5 . . . 6 . . . 7 . . . 8 . . . 9 . . . 10 . . . 11 Very High Effort

(b) Using the following scale, how would you rate the foreman’s job performance (circle one)?
Very Poor Performance 1 . . . 2 . . . 3 . . . 4 . . . 5 . . . 6 . . . 7 . . . 8 . . . 9 . . . 10 . . . 11 Very Good Performance
UNRAVELING THE EXPECTATIONS GAP: AN ASSURANCE GAPS MODEL AND ILLUSTRATIVE APPLICATION

Kimberly Gladden Burke, Stacy E. Kovar and Penelope J. Prenshaw

ABSTRACT
The importance of alignment between users’ and providers’ expectations of accounting services has long been recognized as paramount in the auditing profession. The importance of expectations, and especially expectations gaps, is even more compelling for new assurance services, where the importance of marketing the service is pronounced. This paper develops the Assurance Gaps Model, which describes expectations gaps in general, defining these holistic differences between users’ and providers’ perceptions of assurance services as assurance gaps. The model suggests that assurance gaps really have a number of components – expectations, evaluations of performance and disconfirmation – all of which impact users’ satisfaction with the service. The magnitude of each of these components, as well as the emphasis placed on each one, is important in describing the nature of the gap. This model is consistent with previous research in auditing as well as a large body of research in marketing studying expectations and the satisfaction process (Oliver, 1997). To illustrate potential applications of the Assurance Gaps Model, hypotheses are developed and tested using an online simulation of the ElderCare assurance service proposed by the AICPA/CICA. Results indicate that users and providers demonstrate similar magnitude of each of the factors in the model, but differ in that users emphasize performance in forming satisfaction judgments while providers emphasize expectations. The study and results illustrate the usefulness of the model for performing detailed analysis of assurance gaps and for suggesting appropriate courses of action to manage the factors that contribute to them.

Advances in Accounting Behavioral Research, Volume 7, 169–193 © 2004 Published by Elsevier Ltd. ISSN: 1474-7979/doi:10.1016/S1474-7979(04)07008-5
INTRODUCTION
By the time Liggio (1974) applied the phrase “expectations gap” to auditing, the profession was already experiencing some of the high costs associated with failing to meet the public’s expectations. Within four years, the Cohen Commission would be charged with considering the nature of the gap and what could be done to narrow it. Since then, the effects ascribed to the expectations gap, including increased litigation (Porter, 1993) and the threat of lost self-governance (Gramling, Schatzberg & Wallace, 1996), have been daunting, and the commentaries and research conducted in this area have been voluminous (for example, Anderson, Lowe & Reckers, 1993; Lowe, 1994; Miller, Reed & Strawser, 1991). Most recently, the high profile scandals involving Enron, WorldCom and others, the ongoing commentary about the role of the audit process and financial reporting standards, and the passage of the Sarbanes-Oxley Act make clear the presence and tenacity of this gap.

With renewed scrutiny of the profession’s provision of its existing services and the profession’s continued efforts to provide innovative, nontraditional assurance services aimed at new groups of users, we must ask: do we understand the expectations gap well enough to prevent our history from repeating itself? For providers of new assurance services and for the profession, it is both strategic and timely not only to challenge our understanding of the existing expectations gap, but also to anticipate the potential for new expectations gaps arising from new services. Thus, this paper’s examination of the nature of the expectations gap for existing and new services, as well as its consideration of the potential effects of that gap, is eminently necessary and important.
To accomplish these goals, this paper draws on research related to the audit expectations gap as well as broader expectations research in marketing to develop an Assurance Gaps Model that describes the components and structure of expectations gaps in general. The model recognizes that the expectations gap is really a more holistic gap in the level of assurance perceived by users and providers, and that this gap results from differences in their satisfaction processes for an assurance service, hence the name assurance gap. This conception of
assurance gaps fosters a more sophisticated understanding of differences between users and providers by recognizing that dissatisfied users are typically driven by more than simply expectations. Moreover, understanding more fully how users become dissatisfied gives providers of assurance services keen insight into how they can avoid or mitigate assurance gaps. To demonstrate its application, the Assurance Gaps Model is applied in the context of new assurance services, developing hypotheses related to the nature of the assurance gap for these services and testing these hypotheses using an online study of potential users and providers for one new assurance service, ElderCare. This application demonstrates one potential approach for using the model to examine assurance gaps in a specific context and shows how the model can provide useful guidance for practitioners as they structure their services and contemplate the potential impact of assurance gaps on users’ responses to these services.
A MODEL OF ASSURANCE GAPS
While auditors, in particular, have long been accustomed to the term “expectations gap,” any model that attempts to describe and capture the essence of assurance gaps not only in auditing, but also in a broader context, must carefully consider their nature and scope. The following sections explicitly define assurance gaps and specify their components and form.

Defining Assurance Gaps
In developing a model of assurance gaps, it is first important to provide a precise definition of the term “assurance gap.” Previous research has identified the audit expectations gap in varying ways as: (1) the difference between “society’s expectations of auditors and auditors’ perceived performance” (Porter, 1993, p. 49); (2) the difference between users’ expectations regarding the auditors’ assurance of the financial statements and the responsibility that the auditing profession is willing to assume (Kell & Boynton, 1992); or (3) “differences in perception – especially regarding assurances provided – between users, preparers and auditors” (Epstein & Geiger, 1994, p. 60). These definitions differ in terms of the individuals focused on – society, users or preparers (i.e. the company being audited) – and the comparison being made – whether a comparison is made between the individual’s perception of what should and what does happen or whether a comparison is made between the individual’s perception and that of the provider.
In order to expand the scope and applicability of our model beyond simply audit services, we define the assurance gap broadly as, “The differences, with regard to an assurance service, between users’ and providers’ satisfaction formation processes.” This definition is characterized by its broad focus on users and on the comparison between users’ and providers’ satisfaction as influenced by other determinants (including expectations) rather than simply a comparison of expectations. The decision to focus broadly on users as the individuals of interest stems from a desire for the model to be applicable in a variety of different assurance contexts. In auditing, as evidenced by past research, users may constitute society, management of the audited firm, investment advisors, pension plan managers, financial statement users, bank loan officers or even judges and juries. In other contexts, users of an assurance may be very different, from managers within another firm (for a service like PerformanceView or SysTrust) to the family of an elderly person (ElderCare). Because of the wide variety of potential users and the possibility of different categories of potential users for a specific service (i.e. company management and company stockholders for an audit), any application of the model must define carefully the specific users of interest or carefully model each user group separately. In addition, our definition of assurance gap focuses on the entire satisfaction formation process rather than simply expectations. As is explained in more detail in the next section, this emphasis better captures the breadth of the term “expectations” as it is used in the accounting literature by substituting more distinctive constructs widely used in marketing. 
By focusing the definition on the satisfaction formation process, we are able to extend our examination to the underlying differences in the components of satisfaction between user and provider, where these components include not only expectations, but also perceived performance and the comparison of expectations and perceived performance, defined as disconfirmation in the marketing literature. Therefore, the comparison of satisfaction and its components between user and provider allows a more complete conceptualization of the different aspects of the assurance gap, recognizing both differences between users and providers and differences between expectations and perceptions of performance (i.e. disconfirmation) for each group. More detail on this point, as well as a detailed description of the model, is provided below.
Modeling Assurance Gaps
When examining the expectations gap literature, it becomes clear that the name “expectations gap” seems a misnomer in auditing, connoting that the differences in expectations between financial statement users and CPA providers are sufficient
to explain the resulting litigation or threat to self-governance. Researchers (Humphrey, Moizer & Turley, 1993; Porter, 1993) who have attempted to characterize the nature of the audit “expectations” gap, however, have recognized, at least implicitly, that differences in expectations alone do not explain the phenomenon. Instead, the outcomes of the “expectations” gap seem to involve a more comprehensive evaluation of satisfaction with the service that incorporates expectations, but also reflects perceptions of the performance of the service and the degree to which those expectations and perceptions are aligned. Indeed, Oliver’s (1997) extensive research in marketing reveals that expectations alone are insufficient for describing satisfaction with products and services.

Oliver’s expectancy disconfirmation model of satisfaction (Oliver, 1997), shown in Fig. 1, suggests that satisfaction is a function of three independent variables – prepurchase expectations, performance and disconfirmation. Essentially, the model postulates that consumers form prepurchase expectations of services, evaluate performance outcomes and then subjectively assess the degree to which they perceive that the services meet their expectations, referred to as disconfirmation. Further, the model indicates that expectations, performance and disconfirmation each have a direct effect on satisfaction, implying the following equation:

$S = \delta E + \gamma D + \lambda P$

where S, E, D and P stand for satisfaction, expectations, disconfirmation and performance, respectively, and $\delta$, $\gamma$ and $\lambda$ represent the subjective weights placed on each variable by the individual (i.e. each variable’s level of influence) when developing a satisfaction judgment.
Fig. 1. The Expectancy Disconfirmation Model of Satisfaction. Note: Adapted from Oliver, 1997. Variable labels and path coefficients have been added. Other labels are self-explanatory; $\varepsilon$ represents other subjective factors, besides expectations and performance, that may influence disconfirmation judgments.
Expectations and performance also have an indirect effect on satisfaction through disconfirmation. While strongly influenced by both expectations and performance, consumers’ disconfirmation judgments may also be influenced by other factors not anticipated in these measures (Oliver & Bearden, 1985; Tse & Wilton, 1988). Therefore, the following equation describes disconfirmation judgments:

$D = \alpha E + \beta P + \varepsilon$

where $\alpha$ and $\beta$ represent the subjective weights placed on expectations and performance (i.e. each variable’s level of influence) in arriving at a disconfirmation judgment, and $\varepsilon$ represents other subjective factors influencing disconfirmation.

There is intuitive appeal to the relation between the expectations gap in auditing and consumer satisfaction processes as described by Oliver: it is reasonable to assume that a litigant or public official who threatens the self-governance of the accounting profession is dissatisfied in some way with the service received. Additionally, although often couched in different terms, the concepts underlying consumer satisfaction in Oliver’s model are not entirely new to accounting researchers. Porter (1993) incorporates the elements of consumer satisfaction in her description of the typical view of the “expectations” gap as differences “between society’s expectations of auditors and auditors’ performance, as perceived by society” (p. 49). Here, Porter not only specifically identifies expectations and performance as important to defining the expectations gap, but she also implicitly includes disconfirmation, or the comparison of performance to initial expectations. Similarly, in their empirical investigation of the expectations gap in the United Kingdom, Humphrey, Moizer and Turley (1993) investigated not only expectations regarding auditors’ current and potential responsibilities, but also used short cases to elicit feedback specifically regarding auditors’ performance.
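Returning to the two equations of the expectancy disconfirmation model, substituting the disconfirmation equation into the satisfaction equation separates the direct and indirect effects of expectations and performance. The derivation below is a sketch, not part of Oliver's presentation. It writes $\gamma$ and $\lambda$ for the weights on disconfirmation and performance in the satisfaction equation and $\varepsilon$ for the other subjective factors; these symbol names are assumptions made here for illustration.

```latex
\begin{align*}
S &= \delta E + \gamma D + \lambda P \\
  &= \delta E + \gamma(\alpha E + \beta P + \varepsilon) + \lambda P \\
  &= (\delta + \gamma\alpha)\,E + (\lambda + \gamma\beta)\,P + \gamma\varepsilon
\end{align*}
```

Under these assumptions, the total effect of expectations on satisfaction is the direct weight $\delta$ plus the indirect path $\gamma\alpha$ through disconfirmation, and the total effect of performance is $\lambda + \gamma\beta$.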
Thus, characterization of the gap as being only related to expectations is somewhat misleading. An alternative characterization recognizes that the assurance gap reflects user satisfaction/dissatisfaction with services provided, as influenced by expectations, perceptions of performance and disconfirmation. Although the satisfaction formation process described by Oliver has been aggressively studied by marketing researchers for consumer goods (Tse & Wilton, 1988; Westbrook, 1987) and increasingly so for services (Cronin & Taylor, 1992; Halstead, Hartman & Schmidt, 1994), this literature has also paid little attention to differences in the cognitive processes used by potential providers and users to make satisfaction judgments or the implications of these cognitive differences. As we will discuss below, these differences are an important component of the assurance gap. Consequently, our Assurance Gaps Model, shown in Fig. 2, Panel A, extends Oliver’s work to describe the “expectations” gap in terms of
Fig. 2.
differences between users’ and providers’ components of satisfaction with the services provided. The term “Assurance Gap” rather than “expectations gap” is used in the model and throughout the remainder of the paper to connote differences in the totality of assurance felt by users versus providers as a result of their satisfaction with an assurance service. Consistent with Oliver’s model, the Assurance Gaps Model in Fig. 2 demonstrates that users and providers separately develop satisfaction judgments of the same service by developing expectations, evaluating performance and assessing their degree of disconfirmation. Extending Oliver, we posit that their final levels of satisfaction with the service – and thus the size of the assurance gap – may derive from two sources of differences between users and providers – differences in magnitude of satisfaction or its components and differences in influence, or
the amount of subjective weight placed on each of the components in developing a satisfaction judgment. Differences in magnitude refer to differences between providers and users in the absolute level of any of the factors – denoted by the uppercase letters E, P and D in Fig. 2 – that influence satisfaction. For example, if users consistently perceive that services are performed badly ($P_{user}$) when CPAs believe the services are performed quite well ($P_{provider}$), the magnitude of this difference in perception contributes to the assurance gap.

Alternatively, the assurance gap may result from differences in influence – depicted using lowercase Greek letters to describe the path coefficients in the models. These differences reflect cognitive differences in the way users and providers process the factors influencing satisfaction. For example, assume both users and providers have equally high expectations of a service ($E_{user} = E_{provider}$). Users’ evaluations of their satisfaction with the service may be dependent on those initially high expectations, whereas providers may find that their satisfaction is most dependent upon their perceptions of the performance of the service rather than their initial expectations ($\delta_{user} > \delta_{provider}$ and $\lambda_{user} < \lambda_{provider}$). Thus, while there is little difference in magnitude between the expectations of providers and users, the difference in influence contributes to the assurance gap. Differences in other subjective factors that influence disconfirmation – denoted by $\varepsilon$ in Fig. 2 – may also influence the gap in a similar way. These differences in the sources of the gap may result from differences in the nature of the service or the nature of the users of the service. While the magnitude of the assurance gap ultimately reduces to the difference between the user’s and provider’s levels of satisfaction, it is important to conceptualize the gap in more comprehensive terms as in Fig. 2.
In other words, the gap results from differences in magnitude and influence for each component of satisfaction. The source of the gap is important because it may influence how users respond to the gap and, consequently, the course of action providers must take to reduce the gap. For example, if, as often conceptualized, the audit assurance gap results from differences in the magnitude of expectations between public users and providers, the result may be high levels of disconfirmation for users only when performance – which is normally highly unobservable – becomes observable, as in the case of a bankruptcy. In this case, very high levels of dissatisfaction and a large and obvious assurance gap result only in isolated circumstances, but the result is very costly litigation. To mitigate this problem, careful attention to expectations may be needed. On the other hand, the problem for new assurance services, as we will hypothesize in our illustrative study later in this paper, may deal more with the influence of individual factors. If this is so, the implication may be a more stable and enduring gap that must be managed by careful attention to those factors, such as performance, that are most influential for users.
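The distinction between magnitude and influence can be illustrated with a small numerical sketch. The code below is not the authors' procedure. It assumes the two linear equations of the model, uses hypothetical weights and scores, labels the weights on disconfirmation and performance `gamma` and `lam` for convenience, and splits the gap with a simple counterfactual (the user's scores evaluated with the provider's weights). Other decompositions are possible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Subjective weights (path coefficients) for one group; names are illustrative."""
    delta: float  # expectations -> satisfaction
    gamma: float  # disconfirmation -> satisfaction
    lam: float    # performance -> satisfaction
    alpha: float  # expectations -> disconfirmation
    beta: float   # performance -> disconfirmation


def satisfaction(w: Weights, E: float, P: float, other: float = 0.0) -> float:
    """S = delta*E + gamma*D + lam*P, with D = alpha*E + beta*P + other."""
    D = w.alpha * E + w.beta * P + other
    return w.delta * E + w.gamma * D + w.lam * P


def decompose_gap(w_user, user_inputs, w_prov, prov_inputs):
    """Split S_user - S_provider into a magnitude part and an influence part."""
    s_user = satisfaction(w_user, *user_inputs)
    s_prov = satisfaction(w_prov, *prov_inputs)
    # Counterfactual: the user's E and P judged with the provider's weights.
    s_mixed = satisfaction(w_prov, *user_inputs)
    return {
        "total": s_user - s_prov,
        "magnitude": s_mixed - s_prov,  # same weights, different levels
        "influence": s_user - s_mixed,  # same levels, different weights
    }


# Hypothetical numbers: equal expectations (E) and performance (P) scores,
# but users weight expectations more and providers weight performance more.
users = Weights(delta=0.6, gamma=0.1, lam=0.3, alpha=-0.5, beta=0.5)
providers = Weights(delta=0.3, gamma=0.1, lam=0.6, alpha=-0.5, beta=0.5)
gap = decompose_gap(users, (9.0, 6.0), providers, (9.0, 6.0))
print(gap)
```

Because both groups are given the same expectation and performance scores, the magnitude part is zero and the whole gap is attributed to influence, mirroring the example in the text where equal expectations still produce a gap.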
Components of the model depicted in Fig. 2 have been tested in previous research. However, a systematic, comprehensive analysis of the differences between users’ and providers’ satisfaction and their satisfaction formation is not available. Auditing research has established a difference in magnitude of many of these satisfaction-related variables between users and providers with respect to auditing. Accounting research has not, however, specifically examined the relative influence of these factors on satisfaction. No research related to any aspect of the gap is available for new assurance services, which are likely to be different from auditing services for reasons described in the next section. Consequently, to better illustrate the value of the model for thoroughly evaluating a variety of services and to put forth a comprehensive approach to testing it, the next section of this paper describes an illustrative study conducted based on the model.
Applying the Model to New Assurance Services
In response to a mature market for traditional audit services, in 1993, the American Institute of Certified Public Accountants (AICPA) charged the Special Committee on Assurance Services (SCAS) with identifying ways to reposition the profession for the future. Despite some caution about expanding services after the passage of the Sarbanes-Oxley Act, the SCAS report, released in 1997, is still shaping the future of the accounting profession, affecting both the services provided and the manner in which they are provided, especially for mid-sized firms. With respect to services provided, the Committee recommended and the AICPA has now begun developing a series of assurance services that are “independent professional services that improve the quality of information, or its context, for decision makers” (AICPA, 1997). Specific services developed and offered by practitioners to date include WebTrust, ElderCare, SysTrust and PerformanceView.

These services have many similarities to audit services. They often involve similar sampling and evaluation of transactions or activities; they involve forming an opinion based on the evidence; they involve communicating that opinion to a user; and their success relies heavily on the credibility of the provider. However, these services are also very different from traditional audit services. While the individual services are also very different from one another, each of the new assurance services attempts to provide highly useful information to a very well-defined user. Audits, on the other hand, are designed to appeal to a much broader, more ambiguous group of users. In addition, users of new assurance services often play an integral role in shaping the nature and scope of the assurance engagement through their interaction with the provider. Conversely, users of audited financial statements typically have no direct influence on the nature and scope of the audit and rarely
interact with the auditor. Recognizing these differences from traditional auditing services, the AICPA was prompted to simultaneously promote a new emphasis on “customer focus,” including understanding and responding to users’ needs and challenges, as a necessary competency for the future of the accounting profession. Thus, where users might once have been seen as only passive users of information, with the advent of new assurance services, the AICPA recognizes users as active decision makers who play a crucial role in determining the nature and scope of the engagement. Because of this shift in focus for these new services, and the important need for information related to potential assurance gaps for these services, we have chosen one of them as a good venue for illustrating the application of our Assurance Gaps Model from Fig. 2. Below, we further examine the different components of the potential assurance gap – magnitude and influence – for these new services.

Differences in Magnitude
Implicit in the audit research involving the assurance gap is the assumption that differences in magnitude of satisfaction and its influences, particularly expectations, have created the assurance gap. To date, several auditing researchers (Epstein & Geiger, 1994; Humphrey, Moizer & Turley, 1993; Porter, 1993) have reported significant differences between the expectations of users and CPAs. With respect to the performance component of satisfaction, the results are mixed. Humphrey, Moizer and Turley (1993) reported significant differences between users’ and CPAs’ assessments of performance, but Porter (1993) found that while auditors have generally higher assessments of performance than users, these differences were not statistically significant. While it is tempting to use these results to hypothesize similar differences in magnitude for new assurance services, one factor may prevent an effective analogy.
By its very nature, the auditing research described thus far involves a traditional, mature, and recognized service with which both providers and users have at least some familiarity and experience. This is not the case for new assurance services such as ElderCare, where neither the provider nor the user has any experience with the service and typically little, if any, knowledge of the service to help form their satisfaction judgments. As a result, hypotheses related to the variables of interest – expectations, performance, disconfirmation and satisfaction – are viewed as exploratory and are stated in null form as follows:

H1–H4. There will be no difference between users and providers in expectations, disconfirmation, performance evaluations or satisfaction for new, nontraditional services, so that

$E_{user} = E_{provider}$, $D_{user} = D_{provider}$, $P_{user} = P_{provider}$, $S_{user} = S_{provider}$
Differences in Influence
Auditing researchers to date have not explored the relative influence of the factors that impact satisfaction for the providers vs. users. Although not studied in auditing, significant marketing research exists to provide a basis for anticipating and hypothesizing these effects. That expectations play a key role in determining satisfaction for existing products and services is virtually unchallenged. The very newness of assurance services, however, may alter that role for users. One line of past research (Halstead et al., 1994; Oliver, 1997) suggests that the availability of internal sources of information, especially prior experiences, is important for forming salient and reliable expectations. This research implies that providers, who have a greater store of knowledge about accountants and the services they provide as well as more experience with different accounting services, will be able to form stronger expectations than users, who must rely on external sources of information – in particular marketing communications from the provider – to form their expectations of new accounting services. In short, because of their heightened familiarity with accountants and other accounting services, providers will perceive that they have a better idea of what to expect from a new service than will users, thus encouraging providers to rely more heavily on their expectations.

Users may also have difficulty developing expectations because of the inherent intangibility of the assurance service. Most services are described as high in experience qualities, so that evaluation of the service offering can only be discerned after purchase or during consumption (Zeithaml & Bitner, 2000). So, even though users may have some information available to them to acquire knowledge about a service, as in the case of this study, their lack of experience with the specific service may make forming an expectation even more difficult.
Taken together, this evidence suggests the following hypothesis:

H5. Expectations will have greater influence in the satisfaction process for CPAs than for users for new assurance services, such that $\delta_{provider} > \delta_{user}$ and $\alpha_{provider} > \alpha_{user}$

Additional research suggests that individuals who do not wish to perceive discrepancies between expectations and performance for ego-defensive reasons will be less likely to engage in disconfirmation judgments (Martin, Seta & Crelia,
1990; Oliver, 1997). In the current situation, providers who have a significant stake in performance outcomes are expected to have this ego-defensive motivation and are, therefore, expected to be influenced less by disconfirmation than users who have no such motivation. Additionally, as described earlier, users have little prior experience to form the basis for reliable expectations. As a result, they must rely on disconfirmation, a psychological and holistic assessment of whether they got what they expected (and on performance, as described in the next section), in developing satisfaction judgments. This leads to the following hypothesis:

H6. Disconfirmation will have greater influence in the satisfaction process for users than for CPAs for new assurance services, such that $\gamma_{user} > \gamma_{provider}$

Providers’ ego-defensive motivation, coupled with the lack of stable expectations for users, supports the idea of performance having a greater influence for users than providers. Additionally, based on assimilation-contrast theory, performance is expected to have a smaller influence on satisfaction judgments for providers because their strong expectations will result in performance simply being assimilated toward expectations (Oliver, 1997). These predictions lead to the following hypothesis:

H7. Performance will have greater influence in the satisfaction process for users than for providers for new assurance services, so that $\lambda_{user} > \lambda_{provider}$ and $\beta_{user} > \beta_{provider}$
TESTING THE MODEL

A multi-stage, online survey was conducted to test the hypotheses developed above and summarized in Fig. 2, Panel B, for a representative new assurance service, ElderCare. The online survey was constructed to mirror the sequence of communications with potential users in a typical ElderCare engagement, as described below, and to allow us to measure subjects’ expectations, perceptions of performance, disconfirmation and satisfaction at the appropriate times. Figure 3 describes the time line of the study. After providing demographic data, subjects accessed a web site advertisement for ElderCare services created based on the promotional materials published by the AICPA (1999b) and sponsored by the fictitious Taylor CPA Group. Next, subjects read a scenario describing a new client and his interactions with the Taylor CPA Group in contracting for the service. The scenario incorporated a
Fig. 3. Study Time Line.
detailed description of the service and excerpts from the engagement letter, both of which were prepared based on guidance provided by the AICPA (1997) and Practitioners Publishing Corporation (Lewis et al., 1998). Though written prior to issuance of the Alert, the service described was very similar to the one described in the AICPA’s Assurance Services Alert related to Eldercare (1999a). The service involved an agreed-upon procedures engagement including both non-financial and financial services. For the non-financial services, Taylor CPA Group agreed to regularly review daily log sheets maintained by a care provider, randomly observe the activities of the care provider and conduct biweekly discussions of the quality of the care provider’s work with a geriatric care manager. The care provider’s duties included administering medication for the client’s elderly mother, providing transportation and planning and preparing meals. For the financial services, the Taylor Group maintained the checking account for the elderly mother, paid her routine bills less than $300 and provided a monthly accounting of these activities to the client. Accounting for deposits, bank reconciliations and approval of expenses larger than $300 were performed by the
client. To avoid confounding, fees for services were described as a fixed monthly fee, but no amounts were specified. Additionally, a standard provision precluding the client’s mother from including the CPA firm in her will was included with the engagement letter excerpt. After subjects read the service description, their expectations of the service were measured. Then, they viewed a description of the actual service provided, including excerpts from Taylor CPA Group’s report. After viewing this information, subjects were asked to evaluate the performance of Taylor CPA Group and describe their level of disconfirmation of expectations and their level of satisfaction relative to the service. The entire survey required an average of 30 minutes to complete. The following sections describe in more detail the subjects in the study and measurements used for the variables in the model in Fig. 1.
Subjects

Subjects were a convenience sample; the user group consisted of 370 adults aged 24–65 and the provider group consisted of 62 accountants. Subjects were motivated through a $5 donation to one of several charitable or non-profit organizations. Subjects were contacted through a variety of means. Some were contacted through e-mail lists or at events held by the benefiting charitable organizations. Other subjects were alumni or employees of the researchers’ employing institutions. These individuals were contacted by the researchers via e-mail or advertisement in the university newspaper. All subjects were asked to forward information regarding the study to their family, friends and co-workers who met the criteria for participation; hence, some subjects were contacted in this manner as well. Demographic information about the subjects is provided in Table 1. Though a convenience sample, the relatively well-educated, middle-aged subjects included in the user group reasonably represent the target population of interest for the ElderCare service.
Variable Measures

Because of the unique nature of the service, many measures used in the study were either developed especially for the study or adapted to the ElderCare service from previous research. Given the challenge associated with obtaining enough suitable subjects to test the entire model, initial pilot testing of the instrument focused on readability, reasonableness and understandability. Several representative subjects,
Table 1. Demographic Data.

                                              Users               Providers
                                          Count      %          Count      %
Total number of subjects                    370                   62
Gender
  Male                                      161     44            42     68
  Female                                    208     56            20     32
Previously provided care to an elderly parent
  Yes                                        82     22             9     15
  No                                        287     78            52     85
Educational Level
  High school                                30      8             1      2
  Associates degree                          14      4             1      2
  Bachelors degree                          136     37            40     64
  Masters degree                            104     28            16     26
  Doctoral/medical/law degree                86     23             4      6
Average age                         43 years (Std. 9 yrs)   41 years (Std. 10 yrs)
faculty members and ElderCare providers, provided feedback during the pilot test. Each of the variable measures is described individually below.

Prepurchase Expectations and Evaluation of Performance Outcomes

Prepurchase expectations and evaluation of performance outcomes were both measured using the same twenty-one items developed for the study. Each item consisted of a statement regarding a potential outcome of the ElderCare service. Subjects were asked to indicate their beliefs that the service provided by Taylor CPA Group would result in each outcome using a seven-point Likert scale anchored by “strongly agree” and “strongly disagree.” The twenty-one items measured were developed to reflect four potential dimensions of expectations and performance associated with the service described. The items measuring each dimension are shown in Table 2, and each of the dimensions is described below:

(1) Care Provided. These items related to the services provided by the care provider and observed by the CPA.
(2) Evaluation. These items addressed providing, or assisting with, evaluations of care and the care provider, as well as assurances regarding the quality of care.
(3) Quality of Life. These items dealt with the quality of life and general level of well-being provided to the client and his mother.
Table 2. Expectations and Performance Measures.

Scale/Item | Final Reliability/Status of Item

Care provided | Expectations α = 0.9565, Performance α = 0.9490
  Ensure that my mother receives appropriate physical care. | Included
  Ensure that my mother has reliable transportation to her physical therapy appointments. | Included
  Ensure that my mother takes her medication. | Included
  Ensure that my mother has 3 meals a day. | Included

Evaluation | Expectations α = 0.9477, Performance α = 0.9386
  Help me to evaluate the competence of care providers. | Included
  Provide me with an evaluation of the competence of my mother’s care providers. | Included
  Guarantee a high standard of quality care. | Included
  Help me evaluate the performance of my mother’s care providers. | Included
  Provide me with an evaluation of the performance of my mother’s care providers. | Included
  Provide a comprehensive evaluation of my mother’s actual quality of care. | Included

Quality of life | Expectations α = 0.9267, Performance α = 0.9440
  Make my life a little easier. | Included
  Improve the quality of my mother’s life. | Included
  Let me know everything is ok. | Excluded (poor item remainder coefficient)
  Be valuable to me. | Included
  Allow me to focus on spending quality time with my mother when I visit. | Included

Financial services | Expectations α = 0.8054, Performance α = 0.8523
  Protect my mother from being taken advantage of financially. | Included
  Assure that my mother’s money is invested wisely. | Excluded (poor item remainder coefficient)
  Assure all of my mother’s bills are paid when due. | Included
  Result in all of my mother’s money being spent unnecessarily. | Excluded (did not load from any factor)
  Provide a report of my mother’s monthly expenses. | Included
  Provide an accurate record of the services provided to my mother.(a) | Included

(a) This item was originally part of the Care Provided subscale.
(4) Financial Services. These items dealt with the financial services provided by Taylor CPA Group.

In assessing the reliability and validity of the item measures, all twenty-one items in both the expectations and performance scales were first analyzed using exploratory common factor analyses with varimax rotation to ascertain the dimensionality of the scales (Hair, Anderson, Tatham & Black, 1992). The same measurement model was used for both user and provider groups, as there was no reason to expect differences in the factor structure of these variables for the two groups. Examination of eigenvalues as well as examination of rotated factor patterns suggested that a four-factor solution offered the most meaningful interpretation of the underlying factors. The four-factor solution also explained a large proportion (99%) of the underlying common variability in scores for both expectations and performance measures. Generally, each item loaded on the same factor for both measures, suggesting a common dimensionality for the prepurchase expectations and evaluation of performance outcomes measures, as expected. Additionally, the dimensions were generally consistent with a priori expectations of the four dimensions as described above. Following the factor analysis, an item analysis was utilized to select the items that most reliably measured the latent construct (Spector, 1992). As shown in Table 2, three items were eliminated from the subscales, one because it did not have a significant loading for any of the underlying factors in the factor analysis, and two other items because the item analysis revealed unfavorable item remainder coefficients. A final item, “Provide an accurate record of the services provided to my mother,” loaded on the financial services factor rather than on the care provided factor, where it had been expected to load.
Further consideration suggests that this item could better reflect an outcome resulting directly from the activities of the CPA as opposed to the care provider, making its inclusion in the financial services dimension reasonable. These analyses, which are summarized in Table 2, resulted in four different subscales each for prepurchase expectations and evaluation of performance outcomes. Each of these subscales possessed high degrees of reliability as measured by Cronbach’s alphas, with values ranging from 0.8054 to 0.9566 (reliabilities for each scale are shown in Table 2). As a result, items representing each subscale were averaged, and the resulting subscale measures were used as indicators of the latent constructs, prepurchase expectations and evaluation of performance outcomes, in the structural model, consistent with the recommendation of Landis, Beal and Tesluk (2000). The goal of this process was to capture as much of the content of each of the underlying constructs, expectations and performance, as possible while still maintaining a reasonable number of indicators in the structural model, given the sample size.
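The reliability and averaging steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names and the response data are hypothetical, and it assumes complete responses and the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a subscale.

    rows: list of respondents, each a list of k item scores (no missing data).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def subscale_scores(rows):
    """Average the items of a subscale for each respondent."""
    return [mean(row) for row in rows]


# Hypothetical seven-point responses: 5 respondents x 4 items of one subscale.
responses = [
    [7, 6, 7, 6],
    [5, 5, 6, 5],
    [3, 4, 3, 4],
    [6, 6, 5, 6],
    [2, 3, 2, 3],
]

alpha = cronbach_alpha(responses)
scores = subscale_scores(responses)
print(f"alpha = {alpha:.4f}")
print("subscale scores:", scores)
```

In this sketch, each respondent's averaged score would play the role of one indicator of the latent construct, so four subscales yield four indicators per construct.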
Disconfirmation

Disconfirmation was measured using a three-item, five-point, Likert-type scale suggested by Oliver (1993), Oliver and Swan (1989) and Westbrook (1987). The items required subjects to indicate whether the benefits, outcomes and service overall were relatively better or worse than expected. Exploratory factor analysis indicated that these three items reflected a single, unidimensional construct. Coefficient alpha for the resulting construct was 0.8382. An item analysis was utilized to select the items that most reliably measured the latent construct (Spector, 1992). Based on the item analysis, all three items were included in the structural model.

Satisfaction

Satisfaction was measured using a four-item, semantic differential scale. The items included in the scale are based on generalized satisfaction measures previously developed by Oliver (1993), Oliver and Swan (1989) and Crosby and Stephens (1987). In contrast to the disconfirmation items, these asked subjects to provide absolute, rather than relative, perceptions of the service. The items utilized bipolar adjective scales that required the subjects to describe their feelings about the service received. The positive adjectives on the scales were pleased, contented, satisfied and made a good choice. Coefficient alpha for the resulting scores was 0.9497. An item analysis was utilized to select the items that most reliably measured the latent construct (Spector, 1992). Based on the item analysis, all four items were included in the structural model.
RESULTS

Table 3. Descriptive Statistics and Mean Comparisons for Study Variables.

                     Accountants          Clients
                   Mean(a)    Std.    Mean(a)    Std.    t-Statistic    p-Value
Expectations         5.27     1.00      5.17     1.22        0.65         0.52
Disconfirmation      4.76     1.27      4.68     1.29        0.47         0.64
Performance          5.41     1.01      5.17     1.25        1.46         0.14
Satisfaction         5.26     1.29      5.11     1.45        0.78         0.43

(a) All variables measured on 7-point scales.

Table 3 provides descriptive statistics for both the accountant and user groups for the study variables. The t-test results shown in the table indicate no significant
differences between the accountant and non-accountant groups for any of the satisfaction component variables including expectations, performance, disconfirmation and satisfaction. Consequently, the null hypothesis of no difference between the two groups for Hypotheses 1–4 cannot be rejected. This implies that, based on the information provided in the study materials, both potential providers and potential users had similar perceptions of the Eldercare service. While the mean values for all of the variables were above the mean scale value of 3.5, suggesting that these perceptions were somewhat positive, comments from both participant groups might suggest that both were actually quite skeptical of the service. Failure to support Hypotheses 1–4 also suggests that any gaps between accountants and non-accountants for new assurance services might better be defined in terms of gaps in influence as suggested by Hypotheses 5–7, rather than gaps in magnitude. Hypotheses 5–7 were tested using structural equation modeling. The theoretical model to be examined is shown in Fig. 2, Panel A, with the hypotheses to be tested in Fig. 2, Panel B. Covariances between corresponding indicators of expectations and performance were estimated as part of the measurement model since the subscales used as indicators included the same items examined at different points in time, which fails to satisfy assumptions of independence (Anderson & Gerbing, 1988; Bagozzi, 1980; Gerbing & Anderson, 1984). Group comparisons were performed using Lisrel 8 as described by Joreskog and Sorbom (1993). First, the analysis was run assuming that the parameters of the models were identical for both groups. This model serves as a benchmark for evaluating subsequent models used for hypothesis testing where the structural parameters are allowed to vary between groups. This baseline analysis initially revealed a mediocre fit. 
Modification indices suggested a strong correlation between the error terms for two of the satisfaction measures. Examination of the items revealed strong similarities in their wording. Accordingly, one of these measures was deleted from the model before even considering it as a baseline (Hair et al., 1992). With this improved measurement model in place, the baseline structural model was reestimated and, consistent with Hair et al. (1992), a sample of commonly used goodness of fit measures was examined. The results reveal good fit, but some room for improvement. Chi-square is 260 with 172 degrees of freedom and p < 0.05. While a large p-value is desirable, indicating that the observed covariance matrix of the variables is similar to that implied by the model, small p-values are common in samples of this size (Hair et al., 1992). Root mean square error of approximation (RMSEA), which shows the error per degree of freedom and exhibits good fit at values less than 0.05, is 0.049. The comparative fit index (CFI), which indicates good fit at values above 0.90, is 0.99. The normed $\chi^2$, which has a recommended level between 1 and 2, is 1.51.
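As a quick arithmetic check on the baseline fit statistics reported above, the normed chi-square is simply the chi-square divided by its degrees of freedom. The sketch below recomputes it and compares the reported RMSEA and CFI with the cutoffs quoted in the text; the function and variable names are hypothetical, and the values are taken from the paragraph rather than re-estimated from data.

```python
def normed_chi_square(chi_square, df):
    """Chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi_square / df


# Baseline model statistics as reported in the text.
baseline = {"chi_square": 260.0, "df": 172, "rmsea": 0.049, "cfi": 0.99}

normed = normed_chi_square(baseline["chi_square"], baseline["df"])

# Cutoffs are the ones quoted in the text, not universal rules.
checks = {
    "normed chi-square between 1 and 2": 1.0 <= normed <= 2.0,
    "RMSEA below 0.05": baseline["rmsea"] < 0.05,
    "CFI above 0.90": baseline["cfi"] > 0.90,
}

print(f"normed chi-square = {normed:.2f}")
for label, passed in checks.items():
    print(f"{label}: {'yes' if passed else 'no'}")
```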
The next step in the analysis was to free the parameters referred to in the hypotheses, allowing them to differ between groups. The new fit statistics were then compared with the fit statistics from the baseline model to determine whether the new model is significantly better than the baseline. In particular, because the models are nested models (Hair et al., 1992), the difference between the chi-squares of the two models is distributed as chi-square, with degrees of freedom equal to the difference between the degrees of freedom of the two models. To get an overall view of the hypotheses regarding differences in influence for expectations (H5 ), disconfirmation (H6 ) and performance (H7 ) for users versus providers, first, individual structural coefficients were freed and tests were performed to identify any differences in each of the individual parameters. These tests did not reveal any significant differences in the coefficients for any of the individual paths in the structural model. (The largest chi-square was 2.30, with a p-value of 0.13 for the expectations → satisfaction path; all other chi-squares were much smaller.) Based on this information, Hypothesis 6 suggesting a greater influence of disconfirmation in determining satisfaction for users than providers is not supported.
Fig. 4. Results. Note: ∗ indicates coefficient is significant at p = 0.05. All measurement model coefficients are significant. $\chi^2$ for this model is 254 with 170 df. Compared to the baseline model with $\chi^2$ of 260 with 172 df, the difference in $\chi^2$ is 6 with 2 degrees of freedom, which is significant with p = 0.05.
However, a more careful evaluation of Hypotheses 5 and 7 suggests a difference between users and providers in the relative influence of expectations vs. performance in determining satisfaction. In short, Hypotheses 5 and 7 suggest that providers will be influenced more by expectations than performance and users will be influenced more by performance than expectations. To test this effect more directly, the paths between expectations and satisfaction and performance and satisfaction were simultaneously freed. This new model exhibited significantly better fit than the baseline model, providing support for Hypotheses 5 and 7. $\chi^2$ for the new model is 254 with 170 df, for a difference in $\chi^2$ of 6 with 2 degrees of freedom, which is significant with p = 0.05. Figure 4 shows the resulting structural models and coefficients for both users and providers. The difference in influence for the two determinants of satisfaction is as hypothesized – accountants are influenced more by expectations, with a structural coefficient of 0.35 (t = 2.61), than users, with a structural coefficient of 0.00 (t = 0.01); users are influenced more by performance, with a structural coefficient of 0.47 (t = 5.39), than providers, with a structural coefficient of 0.19 (t = 1.33). The practical implications of these results are discussed in the next section.
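The nested-model comparison used here can be illustrated with a short chi-square difference test. This is a sketch under stated assumptions rather than a reproduction of the LISREL output: the function names are hypothetical, and the tail probability uses closed forms that hold only for 1 or 2 degrees of freedom, which is all the reported comparisons need.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square variate (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("chi-square statistic cannot be negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def chi2_difference_test(chi2_restricted, df_restricted, chi2_free, df_free):
    """Compare a restricted (baseline) model with a nested, freer model."""
    delta_chi2 = chi2_restricted - chi2_free
    delta_df = df_restricted - df_free
    return delta_chi2, delta_df, chi2_upper_tail(delta_chi2, delta_df)


# Baseline (260, 172 df) versus the model with both paths freed (254, 170 df).
delta_chi2, delta_df, p_joint = chi2_difference_test(260.0, 172, 254.0, 170)
print(f"delta chi-square = {delta_chi2:.0f}, delta df = {delta_df}, p = {p_joint:.4f}")

# Largest single-path test reported in the text: chi-square 2.30 with 1 df.
p_single = chi2_upper_tail(2.30, 1)
print(f"single path: p = {p_single:.2f}")
```

Freeing two paths at once yields a difference that just reaches the 0.05 level, whereas the largest single-path difference does not, which is consistent with the pattern of results described in the text.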
IMPLICATIONS FOR PRACTICE

In terms of new assurance services, the results of this application of the Assurance Gaps Model provide some limited evidence of an emerging assurance gap. This gap, however, does not seem to stem from differences in magnitude, as the study finds no difference in the magnitude of the components of satisfaction – expectations, performance assessments, disconfirmation and satisfaction – between users and providers of ElderCare services. Instead, this gap seems to result from differences in influence, where sample user subjects already emphasize performance more heavily while sample providers emphasize expectations in forming their satisfaction judgments. Thus, while providers may rely on their beliefs about CPAs’ reputations or their past experience, users and potential users will require more evidence or indicators of quality performance of the particular service they have purchased. One possible explanation for the lack of differences in magnitude between users and providers may be that these perceptions reflect the newness of the service – with little knowledge of or experience with the service, both might be more generous about the service. If this is the case, providers should focus on managing user expectations and user satisfaction proactively as the service is provided, avoiding the problems associated with assurance gaps that might result
as user perceptions evolve over time. However, comments from both groups of participants suggest a more interesting explanation for the similarity between user and provider perceptions. These comments, voluntarily provided by 47% of the potential users and 41% of the potential providers, point to equal degrees of skepticism regarding the service. In particular, both groups are skeptical of the value of the service, as well as the ability of accountants to provide the service effectively. Of those responding, 33% of providers and 18% of users voiced concerns about the quality of care not being assessed. Some 22% of both groups indicated concerns about difficulties verifying the services provided. And, 36% of users and 33% of providers expressed concern about the degree of value/expertise provided by the CPA in this area. Taken together, these comments suggest that a significant amount of work is needed if ElderCare and other new, non-traditional assurance services are to be successful. In particular, revisions to the service may be needed as well as significant efforts to convince both potential providers and users about the value of the new assurance services and to provide more evidence regarding the quality of the services provided. Additionally, practitioners must recognize that the process of managing assurance gaps is ongoing. This study represents vital perceptions of users and providers to a service in early stages of development, providing guidance that can contribute to the initial success of the new service. However, these perceptions may change over time as both groups develop experience with the service, leading to new gaps. Finally, the subjects in this study were largely inexperienced with providing care to elderly parents. This provides an opportunity for expanding the study to develop a large enough sample to examine the perceptions of users who have more first hand experience with the topic of the assurance.
CONCLUSIONS AND OPPORTUNITIES FOR FUTURE RESEARCH

The primary objective of this study was to develop a model to more completely describe the sources of assurance gaps in a broad context, recognizing that assurance gaps are important to the profession not only in the traditional context of audits, but also in the context of a variety of new services. The second objective of this study was to illustrate application of the model and a method for testing the model in the context of new assurance services. The Assurance Gaps Model shown in Fig. 2 provides a more thorough framework within which academic researchers can examine the nature, scope
and implications of assurance gaps in many different areas of accounting. By adapting research from the marketing literature, the Assurance Gaps Model allows accountants to better articulate and specify the components of what has traditionally been referred to as an expectations gap. By extending the research of the marketing literature, this model allows researchers and practitioners to better examine the nature of the assurance gap by identifying two types of contributors: (1) differences in the magnitude of satisfaction and its components – expectations, performance assessments and disconfirmation; and (2) differences in the influence of each of these variables between users and providers. With increased emphasis on marketing of services on the one hand and focusing on users’ needs on the other, careful examination of the model in the context of all of the services provided by accountants would be advisable and can provide a wealth of information to enable practitioners and researchers to better understand and perhaps anticipate users’ responses to professional services. In an audit context, a thorough investigation of the assurance gap components for a wide variety of different users could provide useful information as the profession engages in rebuilding confidence in the integrity of the audit process. While limitations to the illustrative study provided in the paper exist, it provides a useful illustration of the Assurance Gaps Model. First, as demonstrated through development of Hypotheses 1–7, the model provides a basis for systematic evaluation of the sources of assurance gaps. Second, as described in the results and implications for practice, the application of the model can point to very specific courses of action. As with most experimental research, the limitations for these findings must be acknowledged. For the most part, these limitations arise because of the newness of the service. 
For example, because ElderCare is a new service, there were too few actual users or providers of the service to provide a sample adequate for the purposes of this study. Accordingly, we simulated the communications and activities in a typical ElderCare engagement relying on a sample of representative providers and representative users. While this approach limits the external validity of the study, it allowed us to demonstrate the model’s application in a new context and to address questions of internal validity and timeliness in a significant way. More importantly, the online survey methodology illustrates a very useful method for testing all of the variables in the model, comparing their effects. Future research should continue to examine assurance services in both internally and externally valid contexts. This research can provide insight to both users’ and providers’ perceptions of each service and the source of assurance gaps. Consequently, the potential effects of these gaps may be understood more thoroughly and the appropriate courses of action can be more completely identified.
REFERENCES

American Institute of Certified Public Accountants (1997). Report of the special committee on assurance services [online]. Available: http://www.aicpa.org/.
American Institute of Certified Public Accountants (1999a). Assurance services alert: CPA ElderCare services – 1999. New York: American Institute of Certified Public Accountants.
American Institute of Certified Public Accountants (1999b). CPA ElderCare services marketing tool kit. New York: American Institute of Certified Public Accountants.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411–423.
Anderson, J. C., Lowe, D. J., & Reckers, P. M. J. (1993). Evaluation of auditor decisions: Hindsight bias effects and the expectation gap. Journal of Economic Psychology, 14(4), 711–738.
Bagozzi, R. P. (1980). Causal models in marketing. New York: Wiley.
Cronin, J. J., & Taylor, S. A. (1992). Measuring service quality: A reexamination and extension. Journal of Marketing, 56(July), 55–68.
Crosby, L. A., & Stephens, N. (1987). Effects of relationship marketing on satisfaction, retention, and prices in the life insurance industry. Journal of Marketing Research, 24(November), 404–411.
Epstein, M. J., & Geiger, M. A. (1994). Investor views of audit assurance: Recent evidence of the expectation gap. Journal of Accountancy (January), 60–66.
Gerbing, D. W., & Anderson, J. C. (1984). On the meaning of within-factor correlated measurement errors. Journal of Consumer Research, 11(June), 572–580.
Gramling, A. A., Schatzberg, J. W., & Wallace, W. A. (1996). The role of undergraduate auditing coursework in reducing the expectations gap. Issues in Accounting Education, 4(1), 131–161.
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1992). Multivariate data analysis. New York, NY: Macmillan.
Halstead, D., Hartman, D., & Schmidt, S. L. (1994). Multisource effects on the satisfaction formation process. Journal of the Academy of Marketing Science, 22(2), 114–129.
Humphrey, C., Moizer, P., & Turley, S. (1993). The audit expectations gap in Britain: An empirical investigation. Accounting and Business Research (Winter), 395–411.
Joreskog, K., & Sorbom, D. (1993). Lisrel 8: Structural equation modeling with SIMPLIS command language. Chicago: Scientific Software International.
Kell, W. G., & Boynton, W. C. (1992). Modern auditing. New York: Wiley.
Landis, R. S., Beal, D. J., & Tesluk, P. E. (2000). A comparison of approaches to forming composite measures in structural equation models. Working Paper.
Lewis, G. A., Thompson, C. T., Ecklund, K. J., Popovitch, R. L., Blanco-Best, M., Roeder, C. A., Lovelace, T. W., & Hart, P. I. (1998). Guide to providing ElderCare services. Fort Worth, TX: Practitioners Publishing Company.
Liggio, C. D. (1974). The expectation gap: The accountant’s Waterloo. Journal of Contemporary Business, 3(3), 27–44.
Lowe, D. J. (1994). The expectation gap in the legal system: Perception differences between auditors and judges. Journal of Applied Business Research, 10(3), 39–44.
Martin, L. L., Seta, J. J., & Crelia, R. A. (1990). Assimilation and contrast as a function of people’s willingness and ability to expend effort in forming an impression. Journal of Personality and Social Psychology, 59, 27–37.
Miller, J. R., Reed, S. A., & Strawser, R. H. (1991). The new auditor’s report: Will it close the expectation gap in communication? CPA Journal, 60(5), 68–72.
Oliver, R. L. (1993). Cognitive, affective and attribute bases of the satisfaction response. Journal of Consumer Research, 30(December), 418–430.
Oliver, R. L. (1997). Satisfaction: A behavioral perspective on the consumer. New York: McGraw-Hill.
Oliver, R. L., & Bearden, W. O. (1985). Disconfirmation processes and consumer evaluations in product usage. Journal of Business Research, 13(June), 235–246.
Oliver, R. L., & Swan, J. E. (1989). Equity and disconfirmation perceptions as influences on merchant and product satisfaction. Journal of Consumer Research, 16(December), 372–383.
Porter, B. (1993). An empirical study of the audit-expectation-performance gap. Accounting and Business Research, 24(93), 49–68.
Spector, P. E. (1992). Summated rating scale construction: An introduction. Newbury Park, CA: Sage.
Tse, D. K., & Wilton, P. C. (1988). Models of consumer satisfaction formation: An extension. Journal of Marketing Research, 25(May), 204–212.
Westbrook, R. A. (1987). Product/consumption-based affective responses and postpurchase processes. Journal of Marketing Research, 25(August), 258–270.
Zeithaml, V. A., & Bitner, M. J. (2000). Services marketing. New York: McGraw-Hill.