ADVANCES IN BUSINESS AND MANAGEMENT FORECASTING
ADVANCES IN BUSINESS AND MANAGEMENT FORECASTING
Series Editors: Kenneth D. Lawrence and Ronald K. Klimberg

Recent Volumes:
Volume 1: Advances in Business and Management Forecasting: Forecasting Sales
Volumes 2–6: Advances in Business and Management Forecasting
ADVANCES IN BUSINESS AND MANAGEMENT FORECASTING
VOLUME 7

ADVANCES IN BUSINESS AND MANAGEMENT FORECASTING

EDITED BY
KENNETH D. LAWRENCE New Jersey Institute of Technology, Newark, USA
RONALD K. KLIMBERG Saint Joseph’s University, Philadelphia, USA
United Kingdom – North America – Japan – India – Malaysia – China
Emerald Group Publishing Limited
Howard House, Wagon Lane, Bingley BD16 1WA, UK

First edition 2010

Copyright © 2010 Emerald Group Publishing Limited

Reprints and permission service
Contact: [email protected]

No part of this book may be reproduced, stored in a retrieval system, transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publisher or a licence permitting restricted copying issued in the UK by The Copyright Licensing Agency and in the USA by The Copyright Clearance Center. No responsibility is accepted for the accuracy of information contained in the text, illustrations or advertisements. The opinions expressed in these chapters are not necessarily those of the Editor or the publisher.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-85724-201-3
ISSN: 1477-4070 (Series)
CONTENTS

LIST OF CONTRIBUTORS

EDITORIAL BOARD

PART I: FINANCIAL FORECASTING

TWO-DIMENSIONAL WARRANTY POLICIES INCORPORATING PRODUCT DEVELOPMENT
Amitava Mitra and Jayprakash G. Patankar

FORECASTING THE USE OF SEASONED EQUITY OFFERINGS
Rebecca Abraham and Charles Harrington

THE IMPACT OF LIFE CYCLE ON THE VALUE RELEVANCE OF FINANCIAL PERFORMANCE MEASURES
Shaw K. Chen, Yu-Lin Chang and Chung-Jen Fu

FORECASTING MODEL FOR STRATEGIC AND OPERATIONS PLANNING OF A NONPROFIT HEALTH CARE ORGANIZATION
Kalyan S. Pasupathy

PART II: MARKET FORECASTING

SEASONAL REGRESSION FORECASTING IN THE U.S. BEER IMPORT MARKET
John F. Kros and Christopher M. Keller

A COMPARISON OF COMBINATION FORECASTS FOR CUMULATIVE DEMAND
Joanne S. Utley and J. Gaylord May

CHANNEL SHARE PREDICTION IN DIRECT MARKETING RETAILING: THE ROLE OF RELATIVE CHANNEL BENEFITS
Eddie Rhee

PREDICTING A NEW BRAND’S LIFE CYCLE TRAJECTORY
Frenck Waage

PART III: METHODS AND PRACTICES OF FORECASTING

FORECASTING PERFORMANCE MEASURES – WHAT ARE THEIR PRACTICAL MEANING?
Ronald K. Klimberg, George P. Sillup, Kevin J. Boyle and Vinay Tavva

FORECASTING USING FUZZY MULTIPLE OBJECTIVE LINEAR PROGRAMMING
Kenneth D. Lawrence, Dinesh R. Pai and Sheila M. Lawrence

A DETERMINISTIC APPROACH TO SMALL DATA SET PARTITIONING FOR NEURAL NETWORKS
Gregory E. Smith and Cliff T. Ragsdale

PART IV: FORECASTING APPLICATIONS

FORECASTING THE 2008 U.S. PRESIDENTIAL ELECTION USING OPTIONS DATA
Christopher M. Keller

RECOGNITION OF GEOMETRIC AND FREQUENCY PATTERNS FOR IMPROVING SETUP MANAGEMENT IN ELECTRONIC ASSEMBLY OPERATIONS
Rolando Quintana and Mark T. Leung

USING DIGITAL MEDIA TO MONITOR AND FORECAST A FIRM’S PUBLIC IMAGE
Daniel E. O’Leary

EVALUATING SURVIVAL LIKELIHOODS IN PALLIATIVE PATIENTS USING MULTIPLE CRITERIA OF SURVIVAL RATES AND QUALITY OF LIFE
Virginia M. Miori and Daniel J. Miori
LIST OF CONTRIBUTORS

Rebecca Abraham
Huizenga School of Business and Entrepreneurship, Nova Southeastern University, Fort Lauderdale, FL, USA
Kevin J. Boyle
Department of Decision and Systems Science, Haub School of Business, Saint Joseph’s University, Philadelphia, PA, USA
Yu-Lin Chang
Department of Accounting and Information Technology, Ling Tung University, Taiwan
Shaw K. Chen
College of Business Administration, University of Rhode Island, Kingston, RI, USA
Chung-Jen Fu
Department of Accounting, National Yunlin University of Science and Technology, Taiwan
Charles Harrington
Huizenga School of Business and Entrepreneurship, Nova Southeastern University, Fort Lauderdale, FL, USA
Christopher M. Keller
College of Business, East Carolina University, Greenville, NC, USA
Ronald K. Klimberg
Department of Decision and Systems Science, Haub School of Business, Saint Joseph’s University, Philadelphia, PA, USA
John F. Kros
College of Business, East Carolina University, Greenville, NC, USA
Kenneth D. Lawrence
School of Management, New Jersey Institute of Technology, North Brunswick, NJ, USA
Sheila M. Lawrence
Rutgers Business School, Rutgers University, North Brunswick, NJ, USA
Mark T. Leung
College of Business, University of Texas at San Antonio, San Antonio, TX, USA
J. Gaylord May
Wake Forest University, Winston-Salem, NC, USA
Daniel J. Miori
Palliative and Ethics Service, Millard Fillmore Gates Circle Hospital, Buffalo, NY, USA
Virginia M. Miori
Department of Decision and Systems Science, Haub School of Business, Saint Joseph’s University, Philadelphia, PA, USA
Amitava Mitra
College of Business, Auburn University, Auburn, AL, USA
Daniel E. O’Leary
Marshall School of Business, University of Southern California, Los Angeles, CA, USA
Dinesh R. Pai
Penn State Lehigh Valley, Center Valley, PA, USA
Kalyan S. Pasupathy
Health Management and Informatics, MU Informatics Institute, School of Medicine, University of Missouri, Columbia, MO, USA
Jayprakash G. Patankar
Department of Management, University of Akron, Akron, OH, USA

Rolando Quintana
College of Business, University of Texas at San Antonio, San Antonio, TX, USA
Cliff T. Ragsdale
Department of Business Information Technology, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
Eddie Rhee
Department of Business Administration, Stonehill College, Easton, MA, USA
George P. Sillup
Department of Decision and Systems Science, Haub School of Business, Saint Joseph’s University, Philadelphia, PA, USA
Gregory E. Smith
Williams College of Business, Xavier University, Cincinnati, OH, USA
Joanne S. Utley
School of Business and Economics, North Carolina A&T State University, Greensboro, NC, USA
Vinay Tavva
Department of Decision and Systems Science, Haub School of Business, Saint Joseph’s University, Philadelphia, PA, USA
Frenck Waage
University of Massachusetts at Boston, Boston, MA, USA
EDITORIAL BOARD

EDITORS-IN-CHIEF

Kenneth D. Lawrence, New Jersey Institute of Technology
Ronald Klimberg, Saint Joseph’s University

SENIOR EDITORS

Lewis Coopersmith, Rider College
Virginia Miori, Saint Joseph’s University
John Guerard, Anchorage, Alaska
Daniel O’Leary, University of Southern California
Douglas Jones, Rutgers University
Dinesh R. Pai, The Pennsylvania State University
John J. Kros, East Carolina University
William Stewart, College of William and Mary
Stephen Kudbya, New Jersey Institute of Technology
Frenck Waage, University of Massachusetts
Sheila M. Lawrence, Rutgers University
David Whitlark, Brigham Young University
PART I FINANCIAL FORECASTING
TWO-DIMENSIONAL WARRANTY POLICIES INCORPORATING PRODUCT DEVELOPMENT

Amitava Mitra and Jayprakash G. Patankar

ABSTRACT

Some consumer durables, such as automobiles, carry warranties based on two attributes: the time elapsed since the sale of the product and the usage of the product at a given point in time. Warranty may be invoked by the customer if both time and usage are within the specified warranty parameters and product failure occurs. In this chapter, we assume that usage and product age are related through a random variable, the usage rate, which may have a certain probabilistic distribution as influenced by consumer behavior patterns. Further, the product failure rate is influenced by the usage rate and product age. It is important for the organization to contain expected warranty costs and to select appropriate values of the warranty parameters accordingly. One avenue to impact warranty costs is research on product development, which has the potential to reduce the failure rate of the product. The objective then becomes to determine the warranty parameters while containing the sum of the expected unit warranty costs and research and development (R&D) costs per unit sales, under a limited R&D budget.
Advances in Business and Management Forecasting, Volume 7, 3–22
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1477-4070/doi:10.1108/S1477-4070(2010)0000007004
INTRODUCTION

A majority of consumer products provide some sort of assurance to the consumer regarding the quality of the product sold. This assurance, in the form of a warranty, is offered at the time of sale. The Magnuson–Moss Warranty Act of 1975 (US Federal Trade Commission Improvement Act, 1975) also mandates that manufacturers must offer a warranty for all consumer products sold for more than $15. The warranty statement assures consumers that the product will perform its function to their satisfaction up to a given amount of time (i.e., the warranty period) from the date of purchase. Manufacturers offer many different types of warranties to promote their products. Thus, warranties have become a significant promotional tool for manufacturers. Warranties also limit the manufacturers’ liability in the case of product failure beyond the warranty period. A taxonomy of the different types of warranty policies may be found in the work of Blischke and Murthy (1994). Considering warranty policies that do not involve product development after sale, policies exist for a single item or for a group of items. With our focus on single items, policies may be subdivided into the two categories of nonrenewing and renewing. In a renewing policy, if an item fails within the warranty time, it is replaced by a new item with a new warranty. In effect, the warranty begins anew with each replacement. However, for a nonrenewing policy, replacement of a failed item does not alter the original warranty. Within each of these two categories, policies may be subcategorized as simple or combination. Examples of a simple policy are those that incorporate replacement or repair of the product, either free or on a pro rata basis. The proportion of the warranty time that the product was operational is typically used as a basis for determining the cost to the customer for a pro rata warranty. Given limited resources, management has to budget for warranty repair costs and thereby determine appropriate values of the warranty parameters of, say, time and usage. Although manufacturers use warranties as a competitive strategy to boost their market share, profitability, and image, warranties are by no means cheap. They cost manufacturers a substantial amount of money. The cost of a warranty program must be estimated precisely and its effect on the firm’s profitability must be studied. Manufacturers plan for warranty costs through the creation of a fund for warranty reserves. An estimate of the expected warranty costs is thus essential for management to plan for warranty reserves. For the warranty policy considered, we assume that the product will be repaired if failure occurs within a specified time and the
usage is less than a specified amount. Such a two-dimensional policy is found for products such as automobiles, where the warranty coverage is provided for a time period, say five years, and a usage limit of, say, 50,000 miles. In this chapter, we assume minimal repair, that is, the failure rate of the product on repair remains the same as just before failure. Further, the repair time is assumed to be negligible. In this chapter, we consider the aspect of expenditures on research and development (R&D) to improve a product. Improvement of a product occurs through a variety of means, some of which could be improved design, improved processes, improved labor and equipment, or improved raw material, among others. While R&D expenditures may reduce net revenue in the short run, there is a greater benefit when the long-term objectives of an organization are considered. A major impact of R&D is a reduction in the failure rate of the product. With a better product, the warranty costs associated with products that fail within a prescribed warranty time or usage will be lower. This may lead to an increase in the net revenue, whereby the increase in R&D expenditures per unit sales is more than offset by the decrease in the expected warranty costs per unit sales.
LITERATURE REVIEW

Estimation of warranty costs has been studied extensively for about four decades. One of the earliest papers, by Menke (1969), estimated expected warranty costs of a single sale under linear pro rata and lump-sum rebate plans for nonrenewable policies. Blischke and Scheuer (1975) considered the costs associated with the free replacement and pro rata policy under different time-to-failure distributions and later applied renewal theory (Blischke & Scheuer, 1981) to estimate warranty costs for two types of renewable warranty policies. Other researchers have also used renewal theory (Blacer & Sahin, 1986; Frees & Nam, 1988; Mamer, 1982; Mamer, 1987) to estimate warranty costs for various warranty policies. A good review of the various warranty policies is found in Blischke and Murthy (1992). Murthy and Blischke (1992a) provide a comprehensive framework of analyses in product warranty management and further conduct a detailed review of mathematical models (Murthy & Blischke, 1992b) in this research area. A thorough treatment of warranty cost models and analysis of specific types of warranty policies, along with operational and engineering aspects of product warranties, is found in Blischke and Murthy (1994). The vast literature in warranty analysis is quite disjoint.
A gap exists between researchers from different disciplines. With the objective of bridging this gap, Blischke and Murthy (1996) provided a comprehensive treatise of consumer product warranties viewed from different disciplines. In addition to providing a history of warranty, the handbook presents topics such as warranty legislation and legal actions; statistical, mathematical, and engineering analysis; cost models; and the role of warranty in marketing, management, and society. Murthy and Djamaludin (2002) provided a literature review of warranty policies for new products. As each new generation of product usually increases in complexity to satisfy consumer needs, customers are initially uncertain about its performance and may rely on warranties to influence their product choice. Additionally, servicing of warranty, whether to repair or replace the product by a new one, influences the expected cost to the manufacturer (Jack & Murthy, 2001). A different slant on studying the effect of imperfect repairs on warranty costs has been taken by Chukova, Arnold, and Wang (2004). Here, repairs are classified according to the depth of repair, or the degree to which they restore the ability of the item to function. Huang and Zhuo (2004) used a Bayesian decision model to determine an optimal warranty policy for repairable products that undergo deterioration with age. Wu, Lin, and Chou (2006) considered a model for manufacturers to determine the optimal price and warranty length to maximize profit, based on a chosen life cycle, for a free renewal warranty policy. Huang, Liu, and Murthy (2007) developed a model to determine the parameters of product reliability, price, and warranty strategy that maximize integrated profit for repairable products sold under a free replacement/repair warranty strategy. Another approach to reducing warranty costs is the concept of burn-in of the product, where products are operated under accelerated stress for a short time period before their release to the customer. A study of optimal burn-in time and warranty length under various warranty policies is found in Wu, Chou, and Huang (2007). A warranty strategy that combines a renewing free-replacement warranty with a pro rata rebate policy is found in Chien (2008). In a competitive marketplace such as that of the twenty-first century, products are being sold with long-term warranty policies. These take the forms of extended warranties, warranties for used products, service contracts, and lifetime warranty policies. Since the lifespans under these policies are not well defined, modeling of failures and costs is complex (Chattopadhyay & Rahman, 2008). The majority of past research has dealt with a single-attribute warranty policy, where the warranty parameter is typically the time since
purchase of the product. Singpurwalla (1987) developed an optimal warranty policy based on maximization of expected utilities involving both profit and costs. A bivariate probability model involving time and usage as warranty criteria was incorporated. One of the first studies to treat two-dimensional warranty policies using a one-dimensional approach is that by Moskowitz and Chun (1988). Product usage was assumed to be a linear function of the age of the product. Singpurwalla and Wilson (1993, 1998) modeled time to failure, conditional on total usage. By choosing a distribution for total usage, they derived a two-dimensional distribution for failure using both age and usage. Singpurwalla (1992) also considered modeling survival in a dynamic environment with the usage rate changing dynamically. Moskowitz and Chun (1994) used a Poisson regression model to determine warranty costs for two-dimensional warranty policies. They assumed that the total number of failures is Poisson distributed, with a parameter that can be expressed as a regression function of age and usage of a product. Murthy, Iskander, and Wilson (1995) used several types of bivariate probability distributions in modeling product failures as a random point process on the two-dimensional plane and considered free-replacement policies. Eliashberg, Singpurwalla, and Wilson (1997) considered the problem of assessing the size of a reserve needed by the manufacturer to meet future warranty claims in the context of a two-dimensional warranty. They developed a class of reliability models that index failure by two scales, such as time and usage. Usage is modeled as a covariate of time. Gertsbakh and Kordonsky (1998) reduced usage and time to a single scale, using a linear relationship. Ahn, Chae, and Clark (1998) used a similar concept using a logarithmic transformation. Chun and Tang (1999) found warranty costs for a two-attribute warranty model by considering age and usage of the product as warranty parameters. They provided warranty cost estimation for four different warranty policies (rectangular, L-shaped, triangular, and iso-cost) and performed sensitivity analysis on discount rate, usage rate, and warranty terms to determine their effects on warranty costs. Kim and Rao (2000) considered a two-attribute warranty model for nonrepairable products using a bivariate exponential distribution to explain item failures. Analytical expressions for warranty costs are derived using Downton’s bivariate exponential distribution. They demonstrate the effect of correlation between usage and time on warranty costs. A two-dimensional renewal process is used to estimate warranty costs. Wang and Sheu (2001) considered the effect of warranty costs on optimization of the economic manufacturing quantity (EMQ). As a process
deteriorates over time, it produces defective items that incur rework costs (before sale) or warranty repair costs (after sale). The objective of their paper was to determine the lot size that minimizes the total cost per unit of time, which includes set-up cost, holding cost, inspection cost, rework cost, and warranty costs. Sensitivity analysis is performed on the various costs to determine an optimum production lot size. Yeh and Lo (2001) explored the effect of preventive maintenance actions on expected warranty costs. A model is developed to minimize such costs. Providing regular preventive maintenance within the warranty period increases the maintenance cost to the seller, but significantly reduces the expected warranty cost. An algorithm is developed that determines an optimal maintenance policy. Lam and Lam (2001) developed a model to estimate expected warranty costs for a warranty that includes a free repair period and an extended warranty period. Consumers have an option to renew the warranty after the free repair period ends. The choice of consumers has a significant effect on the expected warranty costs and the determination of the optimal warranty policy. Maintenance policies during warranty have been considered by various authors (Jack & Dagpunar, 1994; Dagpunar & Jack, 1994; Nguyen & Murthy, 1986). Some consider the repair/replacement policy following expiration of the warranty. Dagpunar and Jack (1992) consider the situation where, if the product fails before the warranty time, the manufacturer performs minimal repair. In the event of product failure after the warranty time, the consumer bears the expenses of either repairing or purchasing a new product. Sahin and Polatoglu (1996) study two types of replacement policies on expiration of warranty. In one policy, the consumer applies minimal repair for a fixed period of time and replaces the unit with a new one at the end of this period, while in the second policy the unit is replaced at the time of the first failure following the minimal repair period. Thomas and Rao (1999) provide a summary of warranty economic decision models. In the context of two-dimensional warranty, Chen and Popova (2002) study a maintenance policy which minimizes the total expected servicing cost. An application of a two-dimensional warranty in the context of estimating warranty costs of motorcycles is demonstrated by Pal and Murthy (2003). Majeske (2003) used a general mixture model framework for automobile warranty data. Rai and Singh (2003) discussed a method to estimate the hazard rate from incomplete and unclean warranty data. A good review of the analysis of warranty claim data is found in Karim and Suzuki (2005).
Research Objectives

In this chapter, we consider a two-dimensional warranty policy where the warranty parameters, for example, could be time and usage at the point of product failure. A warranty policy in this context, such as those offered for automobiles, could be stated as follows: the product will be replaced or repaired free of charge up to a time (W) or up to a usage (U), whichever occurs first, from the time of the initial purchase. The warranty is not renewed on product failure. For example, automobile manufacturers may offer a 36-month or 36,000-mile warranty, whichever occurs first. For customers with high usage rates, the 36,000 miles may occur before 36 months. Conversely, for those with limited usage, the warranty time period of 36 months may occur first. Fig. 1 shows a two-dimensional warranty region. We assume that the usage is related to time as a linear function through the usage rate. To model a variety of consumers, the usage rate is assumed to be a random variable with a specified probability distribution. This chapter develops a model based on minimal repair or replacement of failed items. In this chapter, we develop a model from the manufacturer’s perspective. We consider the aspect of product development. Through advances in R&D on products as well as processes, the failure rate of the product may be impacted. This may cause a reduction in the expected warranty costs due to
Fig. 1. Two-Dimensional Warranty Region.
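To make the policy concrete, the following minimal sketch (with hypothetical numbers, not the chapter's data) checks whether a failure falls inside the rectangular warranty region of Fig. 1: a claim is covered only if it occurs before time W and before usage U, whichever limit is reached first.

```python
# Minimal sketch of the two-dimensional warranty region in Fig. 1.
# W and U below are hypothetical illustrative limits.

def covered(t, usage, W=3.0, U=36.0):
    """Return True if a failure at age t (years) with odometer `usage`
    (thousands of miles) falls inside the rectangular warranty region."""
    return (t <= W) and (usage <= U)

print(covered(2.5, 30.0))   # True: both limits respected
print(covered(2.5, 40.0))   # False: usage limit exceeded first
print(covered(3.5, 20.0))   # False: time limit exceeded first
```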
the lower failure rate. By incorporating the sum of the R&D expenditures per sales dollar and the expected warranty costs per sales dollar as the objective function, the problem is to determine the parameters of a warranty policy that minimize this objective function. The manufacturer typically has an idea of the upper and lower bounds on the price, warranty time, usage, and unit R&D expenditures. Optimal parameter values are determined based on these constraints.
MODEL DEVELOPMENT

The following notation is used in the chapter:

W: warranty period offered in the warranty policy
U: usage limit offered in the warranty policy
R: usage rate
t: instant of time
Y(t): usage at time t
X(t): age at time t
λ(t|r): failure intensity function at time t given R = r
N(W, U|r): number of failures under warranty given R = r
c: unit product price
c_s: unit cost of repair or replacement
RD: R&D expenditures per unit sales
Relationship between Warranty Attributes

We assume that the two attributes, say time and usage, are related linearly through the usage rate, which is a random variable. Denoting Y(t) to be the usage at time t and X(t) the corresponding age, we have

Y(t) = R X(t),   (1)

where R is the usage rate. It is assumed that all items that fail within the prescribed warranty parameters are minimally repaired and the repair time is negligible. In this context, X(t) = t.
Distribution Function of Usage Rate

To model a variety of customers, R is assumed to be a random variable with probability density function g(r). The following distribution functions of R are considered in this chapter:

(a) R has a uniform distribution over (a1, b1). This models a situation where every usage rate in the interval (a1, b1) is equally likely across customers. The density function of R is given by

g(r) = 1/(b1 − a1),   a1 ≤ r ≤ b1,
     = 0,             otherwise.   (2)

(b) R has a gamma distribution. This may be used for modeling a variety of usage rates among the population of consumers. The shape of the gamma distribution is influenced by the selection of its parameters. When the parameter p is equal to 1, it reduces to the exponential distribution. The density function is given by

g(r) = e^(−r) r^(p−1) / Γ(p),   0 ≤ r < ∞,  p > 0.   (3)
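As an illustration of the two densities in Eqs. (2) and (3), the sketch below evaluates them with scipy.stats; the support (a1, b1) = (0, 6) and shape p = 2 mirror the values used later in the Results section and are otherwise assumptions.

```python
# Illustrative sketch: the uniform and gamma usage-rate densities g(r)
# of Eqs. (2) and (3), evaluated with scipy.stats.
import numpy as np
from scipy import stats

a1, b1 = 0.0, 6.0          # uniform support for R (assumed, as in the Results)
p = 2.0                    # gamma shape parameter (scale fixed at 1)

r = np.linspace(0.01, 6.0, 5)
g_uniform = stats.uniform.pdf(r, loc=a1, scale=b1 - a1)   # 1/(b1 - a1) on [a1, b1]
g_gamma = stats.gamma.pdf(r, a=p)                         # e^(-r) r^(p-1) / Gamma(p)

for ri, gu, gg in zip(r, g_uniform, g_gamma):
    print(f"r = {ri:4.2f}  uniform g(r) = {gu:.3f}  gamma g(r) = {gg:.3f}")
```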
Failure Rate

Failures are assumed to occur according to a Poisson process where it is assumed that failed items are minimally repaired. If the repair time is small, it can be approximated as being zero. Since the failure rate is unaffected by minimal repair, failures over time occur according to a nonstationary Poisson process with intensity function λ(t) equal to the failure rate. As discussed previously, expenditures on R&D will create an improved product with a reduction in the failure rate. Conditional on the usage rate R = r, let the failure intensity function at time t be given by

λ(t|r) = θ0 + θ1 r + (θ2 + θ3 r) t − a5 RD.   (4)
(1) Stationary Poisson process: Under this situation, the intensity function λ(t|r) is a deterministic quantity as a function of t when θ2 = θ3 = 0. This applies to many
electronic components that do not deteriorate with age and whose failures are due to pure chance. The failure rate in this case is constant.
(2) Nonstationary Poisson process: This models the more general situation where the intensity function changes as a function of t. It is appropriate for products and components with moving parts where the failure rate may increase with time of usage. In this case θ2 and θ3 are not equal to zero.
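The sketch below encodes the intensity function of Eq. (4), including the non-negativity floor that the Results section imposes; the θ and a5 values are the chapter's illustrative choices and should be treated here as assumptions.

```python
# Sketch of the conditional failure intensity in Eq. (4), floored at zero
# because a failure rate cannot be negative.

def failure_intensity(t, r, RD, theta=(0.005, 2.0, 0.05, 0.05), a5=1.9):
    th0, th1, th2, th3 = theta
    lam = th0 + th1 * r + (th2 + th3 * r) * t - a5 * RD
    return max(lam, 0.0)

# Nonstationary case (theta2, theta3 nonzero) versus stationary case
print(failure_intensity(t=1.0, r=2.0, RD=0.1))
print(failure_intensity(t=1.0, r=2.0, RD=0.1, theta=(0.005, 2.0, 0.0, 0.0)))
```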
Expected Warranty Costs

The warranty region is the rectangle shown in Fig. 1, where W is the warranty period and U the usage limit. Let γ1 = U/W. Conditional on the usage rate R = r, if the usage rate r ≥ γ1, warranty ceases at time X_r, given by

X_r = U/r.   (5)

Alternatively, if r < γ1, warranty ceases at time W. The number of failures under warranty, conditional on R = r, is given by

N(W, U|r) = ∫_{t=0}^{W} λ(t|r) dt,     if r < γ1,
          = ∫_{t=0}^{X_r} λ(t|r) dt,   if r ≥ γ1.   (6)

The expected number of failures is thus obtained from

E[N(W, U)] = ∫_{r=0}^{γ1} [ ∫_{t=0}^{W} λ(t|r) dt ] g(r) dr + ∫_{r=γ1}^{∞} [ ∫_{t=0}^{X_r} λ(t|r) dt ] g(r) dr.   (7)
Expected warranty costs (EWC) per unit are, therefore, given by

EWC = c_s E[N(W, U)],   (8)

whereas the expected warranty costs per unit sales (ECU) are obtained from

ECU = c_s E[N(W, U)] / c.   (9)
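Because closed-form expressions for E[N(W, U)] are generally unavailable (see the Results section), Eqs. (5)-(9) can be evaluated by numerical integration. The sketch below does so for the uniform usage-rate density of Eq. (2); the unit price and repair cost are simply assumed at this point (Eqs. (11)-(12) later derive them from the average failure rate), and all parameter values follow the chapter's illustrative choices.

```python
# Sketch: numerical evaluation of Eqs. (5)-(9) under a uniform usage-rate
# density (Eq. (2)). c and c_s are assumed here rather than derived.
from scipy import integrate

a1, b1 = 0.0, 6.0                         # support of R, Eq. (2)
theta = (0.005, 2.0, 0.05, 0.05)          # theta0..theta3 in Eq. (4)
a5, RD = 1.9, 0.1                         # R&D impact and spend per unit sales

def lam(t, r):                            # Eq. (4), floored at zero
    th0, th1, th2, th3 = theta
    return max(th0 + th1 * r + (th2 + th3 * r) * t - a5 * RD, 0.0)

def g(r):                                 # Eq. (2)
    return 1.0 / (b1 - a1) if a1 <= r <= b1 else 0.0

def expected_failures(W, U):              # Eq. (7)
    gamma1 = U / W
    cum = lambda r, upper: integrate.quad(lam, 0.0, upper, args=(r,))[0]
    low, _ = integrate.quad(lambda r: cum(r, W) * g(r), 0.0, gamma1)
    high, _ = integrate.quad(lambda r: cum(r, U / r) * g(r), gamma1, b1)
    return low + high

W, U = 2.0, 5.0                           # 2 years, 50,000 miles (scaled)
c, c_s = 1.0, 0.45                        # assumed unit price and repair cost
EN = expected_failures(W, U)
print("E[N(W,U)] =", round(EN, 4))
print("EWC =", round(c_s * EN, 4), " ECU =", round(c_s * EN / c, 4))  # Eqs. (8)-(9)
```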
We now develop an expression for the average failure rate (λ_ave) that is influenced by R&D expenditures. We have

λ_ave = ∫_{r=0}^{γ1} [ ∫_{t=0}^{W} λ(t|r) dt ] g(r) dr + ∫_{r=γ1}^{∞} [ ∫_{t=0}^{X_r} λ(t|r) dt ] g(r) dr.   (10)
The unit product price is impacted by the average failure rate, and is given by

c = a4 + b4/λ_ave,   (11)

where a4 and b4 are appropriate constants. Accordingly, the unit cost of repair or replacement is obtained from

c_s = a3 + b3 c,   (12)

where a3 and b3 are appropriate constants.
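A minimal sketch of Eqs. (10)-(12) follows, under the same illustrative parameters as above and with Eq. (10) taken as the double integral reconstructed above: the average failure rate is computed numerically and then mapped to the unit price and repair cost.

```python
# Sketch: Eqs. (10)-(12) with the chapter's illustrative parameter values.
from scipy import integrate

a1, b1 = 0.0, 6.0
theta, a5, RD = (0.005, 2.0, 0.05, 0.05), 1.9, 0.1
lam = lambda t, r: max(theta[0] + theta[1]*r + (theta[2] + theta[3]*r)*t - a5*RD, 0.0)
g = lambda r: 1.0 / (b1 - a1) if a1 <= r <= b1 else 0.0

def lambda_ave(W, U):                     # Eq. (10) as reconstructed above
    gamma1 = U / W
    cum = lambda r, upper: integrate.quad(lam, 0.0, upper, args=(r,))[0]
    low, _ = integrate.quad(lambda r: cum(r, W) * g(r), 0.0, gamma1)
    high, _ = integrate.quad(lambda r: cum(r, U / r) * g(r), gamma1, b1)
    return low + high

a4, b4, a3, b3 = 1.0, 0.02, 0.25, 0.2     # illustrative constants from the Results
lave = lambda_ave(W=2.0, U=5.0)
c = a4 + b4 / lave                        # Eq. (11): unit product price
c_s = a3 + b3 * c                         # Eq. (12): unit repair/replacement cost
print(round(lave, 3), round(c, 3), round(c_s, 3))
```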
Mathematical Model

We first consider the constraints that must be satisfied for the decision variables of product price, warranty time, and warranty usage limit. A manufacturer having knowledge of the unit cost of the product and R&D expenditures, and a desirable profit margin, can usually identify a minimum price below which it would not be feasible to sell the product. Similarly, knowing the competition, it has a notion of the maximum price at which the product should be priced. Using a similar rationale, a manufacturer might be able to specify minimum and maximum bounds on the warranty time and usage limit to be offered with the product. Furthermore, the organization will have some knowledge to set minimum and maximum bounds on the unit R&D expenditures. So, the constraints on the policy parameters are

c1 ≤ c ≤ c2;  W1 ≤ W ≤ W2;  U1 ≤ U ≤ U2;  d1 ≤ RD ≤ d2,   (13)
where c1 is the minimum product price, c2 the maximum product price, W1 the minimum warranty period, W2 the maximum warranty period, U1 the minimum usage limit, U2 the maximum usage limit, and d1 and d2 the minimum and maximum bounds on RD, respectively. The objective function, to be minimized, is the sum of the expected warranty costs and R&D expenses per unit sales. Hence, the model becomes

Minimize (ECU + RD),   (14)

subject to the set of constraints given by (13).
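One plausible way to set up this optimization numerically is sketched below: W, U, and RD are treated as the decision variables; c and c_s are derived from the average failure rate via Eqs. (10)-(12); and a bounded local optimizer is used, so, as noted in the Results, the solution is not guaranteed to be global. The price bounds on c are not enforced in this simplified sketch, and all parameter values are the chapter's illustrative choices.

```python
# Sketch: bounded minimization of ECU + RD (Eq. (14)) over W, U, and RD.
# Under the reconstruction above, Eq. (10) has the same integral form as
# Eq. (7), so one evaluation serves both.
import numpy as np
from scipy import integrate, optimize

a1, b1 = 0.0, 6.0
theta, a5 = (0.005, 2.0, 0.05, 0.05), 1.9
a4, b4, a3, b3 = 1.0, 0.02, 0.25, 0.2

def warranty_integral(W, U, RD):
    lam = lambda t, r: max(theta[0] + theta[1]*r + (theta[2] + theta[3]*r)*t - a5*RD, 0.0)
    g = lambda r: 1.0 / (b1 - a1) if a1 <= r <= b1 else 0.0
    cum = lambda r, upper: integrate.quad(lam, 0.0, upper, args=(r,))[0]
    gamma1 = U / W
    low, _ = integrate.quad(lambda r: cum(r, W) * g(r), 0.0, gamma1)
    high, _ = integrate.quad(lambda r: cum(r, U / r) * g(r), gamma1, b1)
    return low + high

def objective(x):
    W, U, RD = x
    EN = warranty_integral(W, U, RD)      # Eq. (7); also used as lambda_ave, Eq. (10)
    c = a4 + b4 / EN                      # Eq. (11)
    c_s = a3 + b3 * c                     # Eq. (12)
    return c_s * EN / c + RD              # Eq. (14): ECU + RD

bounds = [(2.0, 10.0), (5.0, 12.0), (0.01, 2.0)]   # W, U, RD bounds from Eq. (13)
res = optimize.minimize(objective, x0=np.array([2.0, 5.0, 0.5]),
                        bounds=bounds, method="L-BFGS-B")
print("W, U, RD =", np.round(res.x, 3), " objective =", round(res.fun, 3))
```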
RESULTS

The application of the proposed model is demonstrated through some sample results using selected values of the model parameters. The complexity of calculating E[N(W, U)], given by Eq. (7), influences the calculation of ECU, given by Eq. (9), which ultimately impacts the objective function given by Eq. (14). Closed-form solutions for E[N(W, U)] are usually not feasible in general cases. Hence, numerical integration methods are used. Further, the optimal values of the objective function are not guaranteed to be globally optimal. Two distributions are selected for the usage rate, R: the uniform distribution over (0, 6) and the gamma distribution with parameter p = 2, 4. For the failure rate intensity function, conditional on R, the selected parameters are θ0 = 0.005; θ1 = 2, 5; θ2 = 0.05; θ3 = 0.05. Based on the chosen value of θ1, the value of the parameter a5, which captures the impact of RD on the failure rate, is selected accordingly. For θ1 = 2, a5 is selected to be 1.9, while for θ1 = 5, a5 is selected to be 4.9. Note that the failure rate cannot be negative; hence an appropriate constraint is placed when determining feasible solutions. To study the stationary case, θ2 and θ3 are set to 0. The unit product price, based on the average failure rate, is found using the parameter values a4 = 1.0 and b4 = 0.02. Similarly, the unit cost of repair or replacement is found using a3 = 0.25 and b3 = 0.2. Bounds on the warranty policy parameters are as follows: unit product price between $10,000 and $40,000 (c1 = 1, c2 = 4); warranty period between 2 and 10 years (W1 = 2, W2 = 10); and usage limit between 50,000 and 120,000 miles (U1 = 5, U2 = 12). For the R&D expenditures (RD) per unit sales, the bounds are selected as 0.01 and 2 (d1 = 0.01, d2 = 2), respectively.
Fig. 2. Lambda Average (λ_ave) Versus U for Different Values of W for Uniform Distribution of R.
The behavior of the average failure rate (λ_ave) as a function of the usage limit (U) and the warranty time limit (W), for a given level of R&D expenditures per unit sales (RD), is shown in Fig. 2. The distribution of the usage rate is assumed to be uniform with the failure rate being stationary, and RD = 0.1. As expected, λ_ave increases with U, for a given W. Further, the average failure rate for large values of W dominates that for smaller values of W. For large values of W (W = 10, 8), the average failure rate increases by more than twofold over the range of U. For small values of W (W = 2), the increase is less than 50%. In the chosen range of U, λ_ave appears to increase linearly for large values of W. However, for small values of W, λ_ave tapers off for large values of U. A similar behavior is observed in Fig. 3, where the distribution of the usage rate is gamma, the failure rate is stationary, and RD = 0.1. However, in this situation, the increase in the average failure rate is not as large, relative to the uniform distribution of the usage rate. For small values of W (W = 2), the increase in the average failure rate is minimal and it approaches its asymptotic value, as a function of U, rather quickly. The ECU function is also studied as a function of the warranty parameters W and U and the R&D expenditures per unit sales (RD).
Fig. 3. Lambda Average (λ_ave) Versus U for Different Values of W for Gamma Distribution of R.

Fig. 4. Expected Warranty Costs per Unit Sales for Uniform Distribution of R.
Fig. 4 shows the ECU function for various values of W, for RD = 0.5. The distribution of the usage rate is uniform with the failure rate being stationary. As expected, ECU increases with U, for a given W. For large values of W (W = 10, 8, 6), certain small values of U are not feasible.
Fig. 5. Expected Warranty Costs per Unit Sales for Gamma Distribution of R.
Also, ECU for large values of W dominates that for smaller values of W. For large values of W, ECU increases by more than twofold over the range of U. For small values of W (W = 2), the increase in ECU is about 33%. In the selected range of U, ECU seems to increase linearly for large values of W. However, for small values of W, the increase in ECU tapers off with an increase in U. A similar behavior is observed in Fig. 5, where the distribution of the usage rate is gamma (p = 2), the failure rate is stationary, and RD = 0.5. The increase in the ECU function over the chosen range of U is smaller than when the usage rate distribution is uniform with a stationary failure rate. The increase in the ECU function is more asymptotic than linear for large values of W, as observed in Fig. 4. Further, for small values of W (W = 2), the rate of increase is much smaller than when the usage rate distribution is uniform. This suggests that, depending on the type of consumer (as estimated by the usage rate distribution), the manufacturer could offer differing levels of the warranty parameters (U and W) while keeping the expected warranty costs per unit sales within certain bounds. Fig. 6 shows the ECU values as a function of U for W = 2, for different values of RD. The chosen distribution of the usage rate is uniform with the failure rate being stationary. The impact of the variable RD can be characterized from this graph. Obviously, for larger values of RD, ECU is smaller than for smaller values of RD over the entire range of the warranty parameters. Interestingly, the ECU values taper off asymptotically for large values of U, for all chosen values of RD. For large values
Fig. 6. Expected Warranty Costs per Unit Sales for Uniform Distribution of R and Various Values of RD.
Table 1. Optimal Warranty Policy Parameters.

Distribution of R    Failure Rate     c      W      U      RD     ECU + RD
Uniform              Stationary       1.022  2.000  5.000  2.000  2.877
Uniform              Nonstationary    1.009  2.000  5.000  2.000  2.970
Gamma (p = 2)        Stationary       1.401  2.000  7.623  2.000  2.016
Gamma (p = 2)        Nonstationary    1.062  2.000  7.642  2.000  2.139
of RD (RD = 2.0), the increase in ECU is marginal as a function of U. When considering the total objective function (ECU + RD), it can be seen that this function could be smaller for large values of RD (say RD = 2.0), when the total objective function approaches a value slightly below 4.0, even for large values of W. However, for small values of RD (say RD = 0.01), the total objective function approaches a value above 5.0. Table 1 shows some results on the optimal warranty policy parameters of unit price, warranty time, and usage limit, as well as R&D expenditures per unit sales. The objective function value, the sum of expected warranty costs and R&D costs per unit sales, is also shown. The parameter values discussed previously are used, with θ1 = 2 and a5 = 1.9. From Table 1, it is observed that spending the maximum permissible amount on R&D expenditures per unit sales leads to minimization of the
total warranty and R&D expenditures per unit sales. Obviously, the choice of the selected model parameters will influence this decision. Further, of the three warranty policy parameters of unit price, warranty time, and usage limit, unit price and warranty time appear to have the greater impact on the objective function. With the goal being to minimize the objective function, the optimal values of unit price and warranty time are close to their respective lower bounds. Some flexibility is observed in the optimal values of the usage limit.
CONCLUSIONS

A two-dimensional warranty has been considered. With the concept of product development in mind, the impact of unit R&D expenditures has been incorporated in a model. The policy parameters are the warranty time, usage limit, and unit product price, as well as the unit R&D expenditures per unit sales. It is well known that expected warranty costs increase with the warranty time and usage limit. However, through expenditures on R&D, the failure rate of the product may be reduced. Such a reduction in the failure rate may reduce the expected warranty costs per unit sales. Hence, the objective function considered is the sum of the expected warranty costs and the R&D expenditures, per unit sales. It is desirable to minimize this combined objective function subject to constraints on the policy parameters. Several possibilities exist for future research in this area. One could involve estimation of the distribution of the usage rate of customers, based on the availability of data from prior customers. Second, the impact of simultaneous product development by competitors could also be an avenue for exploration. The manufacturer could be impacted by the degree of product improvement offered by competitors. This, in turn, may force the manufacturer to achieve certain desired features of the product. For example, the manufacturer may have to improve the average failure rate of the product to below a chosen level. The problem then becomes determination of the unit R&D expenditures, along with the warranty policy parameters, to offer a competitive product as well as a competitive warranty policy.
REFERENCES

Ahn, C. W., Chae, K. C., & Clark, G. M. (1998). Estimating parameters of the power law process with two measures of failure rate. Journal of Quality Technology, 30, 127–132.
Blacer, Y., & Sahin, I. (1986). Replacement costs under warranty: Cost moments and time variability. Operations Research, 34, 554–559.
Blischke, W. R., & Murthy, D. N. P. (1992). Product warranty management – I: A taxonomy for warranty policies. European Journal of Operations Research, 62, 127–148.
Blischke, W. R., & Murthy, D. N. P. (1994). Warranty cost analysis. New York: Marcel Dekker, Inc.
Blischke, W. R., & Murthy, D. N. P. (Eds). (1996). Product warranty handbook. New York: Marcel Dekker, Inc.
Blischke, W. R., & Scheuer, E. M. (1975). Calculation of the warranty cost policies as a function of estimated life distributions. Naval Research Logistics Quarterly, 22(4), 681–696.
Blischke, W. R., & Scheuer, E. M. (1981). Applications of renewal theory in analysis of free-replacement warranty. Naval Research Logistics Quarterly, 28, 193–205.
Chattopadhyay, G., & Rahman, A. (2008). Development of lifetime warranty policies and models for estimating costs. Reliability Engineering & System Safety, 93(4), 522–529.
Chen, T., & Popova, E. (2002). Maintenance policies with two-dimensional warranty. Reliability Engineering and System Safety, 77, 61–69.
Chien, Y. H. (2008). A new warranty strategy: Combining a renewing free-replacement warranty with a rebate policy. Quality and Reliability Engineering International, 24, 807–815.
Chukova, S., Arnold, R., & Wang, D. Q. (2004). Warranty analysis: An approach to modeling imperfect repairs. International Journal of Production Economics, 89(1), 57–68.
Chun, Y. H., & Tang, K. (1999). Cost analysis of two-attribute warranty policies based on the product usage rate. IEEE Transactions on Engineering Management, 46(2), 201–209.
Dagpunar, J. S., & Jack, N. (1992). Optimal repair-cost limit for a consumer following expiry of a warranty. IMA Journal of Mathematical Applications in Business and Industry, 4, 155–161.
Dagpunar, J. S., & Jack, N. (1994). Preventive maintenance strategy for equipment under warranty. Microelectron Reliability, 34(6), 1089–1093.
Eliashberg, J., Singpurwalla, N. D., & Wilson, S. P. (1997). Calculating the warranty reserve for time and usage indexed warranty. Management Science, 43(7), 966–975.
Frees, E. W., & Nam, S. H. (1988). Approximating expected warranty cost. Management Science, 43, 1441–1449.
Gertsbakh, I. B., & Kordonsky, K. B. (1998). Parallel time scales and two-dimensional manufacturer and individual customer warranties. IIE Transactions, 30, 1181–1189.
Huang, H. Z., Liu, Z. J., & Murthy, D. N. P. (2007). Optimal reliability, warranty and price for new products. IIE Transactions, 39, 819–827.
Huang, Y. S., & Zhuo, Y. F. (2004). Estimation of future breakdowns to determine optimal warranty policies for products with deterioration. Reliability Engineering & System Safety, 84(2), 163–168.
Jack, N., & Dagpunar, J. S. (1994). An optimal imperfect maintenance policy over a warranty period. Microelectron Reliability, 34(3), 529–534.
Jack, N., & Murthy, D. N. P. (2001). Servicing strategies for items sold with warranty. Journal of Operational Research, 52, 1284–1288.
Karim, M. R., & Suzuki, K. (2005). Analysis of warranty claim data: A literature review. International Journal of Quality & Reliability Management, 22(7), 667–686.
Kim, H. G., & Rao, B. M. (2000). Expected warranty cost of two-attribute free replacement warranties based on a bivariate exponential distribution. Computers and Industrial Engineering, 38, 425–434.
Lam, Y., & Lam, P. K. W. (2001). An extended warranty policy with options open to the consumers. European Journal of Operational Research, 131, 514–529.
Majeske, K. D. (2003). A mixture model for automobile warranty data. Reliability Engineering and System Safety, 81, 71–77.
Mamer, J. W. (1982). Cost analysis of pro rata and free-replacement warranties. Naval Research Logistics Quarterly, 29(2), 345–356.
Mamer, J. W. (1987). Discounted and per unit costs of product warranty. Management Science, 33(7), 916–930.
Menke, W. W. (1969). Determination of warranty reserves. Management Science, 15(10), 542–549.
Moskowitz, H., & Chun, Y. H. (1988). A Bayesian approach to the two-attribute warranty policy. Paper No. 950. Krannert Graduate School of Management, Purdue University, West Lafayette, IN.
Moskowitz, H., & Chun, Y. H. (1994). A Poisson regression model for two-attribute warranty policies. Naval Research Logistics, 41, 355–376.
Murthy, D. N. P., & Blischke, W. R. (1992a). Product warranty management – II: An integrated framework for study. European Journal of Operations Research, 62, 261–281.
Murthy, D. N. P., & Blischke, W. R. (1992b). Product warranty management – III: A review of mathematical models. European Journal of Operations Research, 62, 1–34.
Murthy, D. N. P., & Djamaludin, I. (2002). New product warranty: A literature review. International Journal of Production Economics, 79, 231–260.
Murthy, D. N. P., Iskander, B. P., & Wilson, R. J. (1995). Two dimensional failure free warranty policies: Two dimensional point process models. Operations Research, 43, 356–366.
Nguyen, D. G., & Murthy, D. N. P. (1986). An optimal policy for servicing warranty. Journal of the Operational Research Society, 37, 1081–1098.
Pal, S., & Murthy, G. S. R. (2003). An application of Gumbel’s bivariate exponential distribution in estimation of warranty cost of motorcycles. International Journal of Quality & Reliability Management, 20(4), 488–502.
Rai, B., & Singh, N. (2003). Hazard rate estimation from incomplete and unclean warranty data. Reliability Engineering and System Safety, 81, 79–92.
Sahin, I., & Polatoglu, H. (1996). Maintenance strategies following the expiration of warranty. IEEE Transactions on Reliability, 45(2), 220–228.
Singpurwalla, N. D. (1987). A strategy for setting optimal warranties. Report TR-87/4. Institute for Reliability and Risk Analysis, School of Engineering and Applied Science, George Washington University, Washington, D.C.
Singpurwalla, N. D. (1992). Survival under multiple time scales in dynamic environments. In: J. P. Klein & P. K. Goel (Eds), Survival analysis: State of the art (pp. 345–354).
Singpurwalla, N. D., & Wilson, S. P. (1993). The warranty problem: Its statistical and game theoretic aspects. SIAM Review, 35, 17–42.
Singpurwalla, N. D., & Wilson, S. P. (1998). Failure models indexed by two scales. Advances in Applied Probability, 30, 1058–1072.
Thomas, M. U., & Rao, S. S. (1999). Warranty economic decision models: A summary and some suggested directions for future research. Operations Research, 47, 807–820.
US Federal Trade Commission Improvement Act. (1975). 88 Stat 2183, pp. 101–112.
Wang, C.-H., & Sheu, S.-H. (2001). The effects of the warranty cost on the imperfect EMQ model with general discrete shift distribution. Production Planning and Control, 12(6), 621–628.
Wu, C. C., Chou, C. Y., & Huang, C. (2007). Optimal burn-in time and warranty length under fully renewing combination free replacement and pro-rata warranty. Reliability Engineering & System Safety, 92(7), 914–920.
Wu, C. C., Lin, P. C., & Chou, C. Y. (2006). Determination of price and warranty length for a normal lifetime distributed product. International Journal of Production Economics, 102, 95–107.
Yeh, R. H., & Lo, H. C. (2001). Optimal preventive maintenance warranty policy for repairable products. European Journal of Operational Research, 134, 59–69.
FORECASTING THE USE OF SEASONED EQUITY OFFERINGS

Rebecca Abraham and Charles Harrington

ABSTRACT

Seasoned equity offerings (SEOs) are sales of stock after the initial public offering. They are a means to raise funds through the sale of stock rather than the issuance of additional debt. We propose a method to predict the characteristics of firms that undertake this form of financing. Our procedure is based on logistic regression, where firm-specific variables are selected from the perspective of the firm’s need to raise cash, such as high debt ratios, high current liabilities, reductions and changes in current debt, significant increases in capital expenditure, and cash flows in terms of cash as a percentage of assets.
Seasoned equity offerings (SEOs), more descriptively termed secondary equity offerings, are the issue of stock by a firm that has already completed a primary issue. From a capital structure perspective, a firm can raise long-term funds by using internal financing if it has the funds available. Given the likelihood that internal funds may be insufficient to meet long-term needs for new product development, expansion of facilities, or research and development investment, all of which require significant amounts of capital, raising funds from external sources becomes the only viable alternative. This may take the form of borrowing from financial institutions
Advances in Business and Management Forecasting, Volume 7, 23–36
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1477-4070/doi:10.1108/S1477-4070(2010)0000007005
(acquiring debt), or issuing common stock through a seasoned equity offering to existing or new shareholders (selling equity). This chapter is directed toward forecasting the likelihood that a firm would choose equity. The SEO research is limited. Only a few studies (Masulis & Korwar, 1986; Mikkelson & Partch, 1986) investigate the reasons for using SEOs as a means of external funding. Others focus on a single variable as a determinant of the SEO alternative. For example, Hull and Moellenberndt (1994) examined bank debt reductions, Hull (1999) the failure to meet industry debt standards, and Johnson, Serrano, and Thompson (1996) the ability to capitalize on investment opportunities. We suggest that a complex interplay of factors determines the SEO choice decision, particularly the availability of debt, current cash flow, and investment opportunities, so that any analysis must consider the simultaneous effect of all three groups of variables. Cash flow considerations, in particular, have been omitted from the above studies. Why would a firm choose equity over debt? The tax deductibility of the interest on debt renders debt the cheaper source of capital, and debt does not result in the dilution of ownership as would be the case if additional shares are issued to new stockholders. Myers and Majluf (1984) theorized that managers have privileged information about the firm. They are aware of its cash flows, its retention of earnings, sales prospects, and the need for capital and research expenditure. If managers act rationally and have the firm’s best interests at heart, they will invest in positive NPV projects and raise firm value. The amount of capital for investment in these projects may have to be obtained externally; excessive debt may alarm existing shareholders, given that the tax deductibility of interest on debt is substantially offset by the risk of financial distress and bankruptcy in the event that the firm’s future cash flows are insufficient to meet fixed payments of principal and interest. That future cash flows may be insufficient is a real concern given the uncertainty of the current economic environment. In other words, multiple signals influence the choice of financing: negative signals from the escalation of the risk of financial distress from use of debt, positive signals from the tax benefit of debt and the lack of dilution of ownership, positive signals from management’s prudent undertaking of projects, and negative signals from management submitting to pressure from existing shareholders not to issue stock. Ambarish, John, and Williams (1987) concluded that positive signals dominate in favor of issuing additional equity, empirically documenting positive announcement effects from seasoned equity issues.
REVIEW OF THE LITERATURE

Information asymmetry is the cornerstone of the financing decision. By definition, it is the examination of transactions in which there is an imbalance of information, with one party to the transaction having more valuable information that has the potential to influence the outcome. Managers have inside information on day-to-day performance, which motivates them to select the optimal method of financing. The question becomes how this information advantage may be gleaned by outsiders. Such denouement of management intentions was referred to by Stigler (1960), the originator of the concept, as screening. The uninformed party (investors and us, in this case) may use observations of the behavior of the informed party to close the imbalance in information by evaluating the choices of managers, which were based on their private information. Walker and Yost (2008) attempted to accomplish this goal by observing the financial performance of firms following SEO announcements. Like us, they recognized the need to incorporate variables that measure diverse motivations for selecting SEOs, namely, debt reduction, capitalizing on investment opportunities, and general operational reasons, particularly declining performance. However, they measured these effects in terms of SEO performance after the announcement, using both financial statement information and statements made by management at announcement. This study measures SEO motivations before the announcement takes place, using financial statements only, as we maintain that there are issues of response bias in self-report measures. We also wish to update and extend their sample, which consisted of two years of pre-2001 data during a term of rapid economic expansion, to suit the slower growth of the current era. We approach the issue from a forecasting perspective: we use a large sample based on the entire Compustat database that meets our criteria, instead of confining our analysis to just firms that made SEO announcements, as we wish to use firm characteristics to predict the likelihood of an SEO offering. Walker and Yost (2008) observed that expansion was the dominant goal for firms, so that those with high levels of debt concentrated on capitalizing on growth opportunities rather than debt reduction. Any debt retirement consisted of paying off old debt contracts and acquiring significant levels of new debt of up to 50–90% of total capital. Both operating cash flow and liquidity declined in the two years following SEO announcement, suggesting that internal operational factors may have played a role in motivating management to select SEOs. However, as their data were obtained following the announcement, it is possible that prevailing conditions after the announcement confounded the
results, so that business conditions after the announcement made expansion the primary objective over debt reduction, or that a sudden decline in operating performance could have occurred independently of the SEO financing decision. Operating cash flow was measured by operating income before depreciation. As depreciation tax shields provide a major impetus to firms seeking to purchase new capital equipment, thereby increasing the level of investment in equipment, their exclusion could lead to the overstatement of operating income. Overstatements of operating income lessen the likelihood of SEO choice, as the general financial health of the firm appears unduly optimistic. Liquidity was measured by the ratio of net working capital to total assets. The rationale was that net working capital, or the extra funds from liquid sources after payment of current debts, declined following the announcement, so that such firms use lack of liquidity as a criterion in their choice of SEOs. We prefer to focus on cash flow mainly because cash is the most liquid of all current assets. Net working capital includes accounts receivable and inventory, which are less liquid assets than cash. Accounts receivable typically takes 90 days to be liquidated; if it is liquidated in less time, it is due to factoring, which involves significant losses. Inventory is the least liquid of the current assets, with goods remaining unsold for months, so that their ultimate conversion to cash is certainly not timely, and may even be questionable. Further, we take the position that multiple measures of liquidity are necessary as they reveal different aspects of cash flow, and provide a fuller picture of a firm’s liquidity position. We supplement overall cash flow measures with cash flow investing to cover unexpected expenses during expansion, and cash flow financing to explain reductions in cash flow to cover dividend payouts.
HYPOTHESIS DEVELOPMENT

If a firm needs to raise additional funds, this need evidently emerges from a perceived need for cash in the immediate future. Therefore, the first group of variables consists of cash flow variables.
Cash Flow

The first source of cash flow is income, which is the net profit of the business after payment of all expenses, interest, and taxes. If the firm generates sufficient income, it would have the funds needed to meet all of its current
expenses and reinvest retained earnings in the firm. Therefore, the change in retained earnings would forecast the need to generate funds externally. If retained earnings continue to increase in conjunction with debt, it appears that the firm has exhausted its internal source of funds and needs a seasoned equity offering.
Cash Flow Investing

If the firm has rapidly rising capital expenditures, it may be involved in a major expansion. This could take the form of investing in foreign markets, expanding production for the domestic market, or new product development for either market. The change in capital expenditure should act as an explanatory variable in determining the likelihood of seasoned equity offerings.
Cash Flow Financing

Cash flow financing refers to the methods of disbursement of idle cash generated by business operations. The first payout is dividends. Initial and subsequent dividend announcements send a strong positive signal about the financial health of the firm in that they disseminate information to the investing public that the firm is financially strong enough to sustain the distribution of cash to its shareholders, and that it wishes shareholders to benefit from the continued success of the firm (John, Koticha, Narayanan, & Subrahmanyam, 2000). Rising dividend payouts during economic prosperity and the maintenance of dividends at a stable level during economic downturns bolster investor confidence, and such policies are likely to be employed by firms that believe investors have the confidence to continue investing through seasoned equity offerings.
Long-Term Debt Reduction

Our position is that firms engage in a plan of long-term debt reduction to reduce the threat of financial distress and bankruptcy. Such firms do not wish to return to dependence on debt and may therefore be likely to seek additional funding from equity sources.
Changes in Current Debt

Coupled with long-term debt reduction are changes in current debt. Falling current debt indicates the desire to forego debt as a source of financing. Given that internal financing may be insufficient, equity becomes the only alternative.
BALANCE SHEET VARIABLES

The balance sheet, by definition, indicates the financial position of a firm at a particular point in time. The most important variable may be total assets. Given that a seasoned equity offering will only be attractive to investors who have sufficient confidence that their funds will be used wisely, it is highly plausible that they will seek large, visible firms that will disseminate sufficient information about their future expansion and investment plans. Only such firms will have stock that is liquid enough to be traded regularly and in sufficient quantities to enable funds to be raised for significant capital expenditures. Small firms have little collateral value and cannot raise funds easily through equity due to high issuance costs and lack of credibility (Myers & Majluf, 1984). If any funds are raised through equity, it is due to their inability to obtain sources of debt funding, as documented by Fama and French (2002). We will use total assets as a discriminator by excluding the lowest 75 percent of firms as being unable to raise funds due to lack of perceived liquidity. Other balance sheet variables that merit consideration include cash and short-term investments, investment and advances, current liabilities, and long-term debt.
Cash and Short-Term Investments

Cash and short-term investments refers to the cash balance in a demand deposit account as well as investments in marketable securities, which consist of short-term bills and stocks held for a 1–3 month duration that act as interest-earning repositories for idle cash. Declining levels of cash as a percentage of total assets indicate short-term needs for cash, usually to meet high interest or other fixed payments such as leases of capital equipment or debt repayment. Such firms are less likely to increase fixed payments through increased dependence on debt and would opt for equity financing.
Long-Term Debt

An annual increase in long-term debt would be detrimental to the firm from the perspective of controlling risk. Debt is inherently risky in that it imposes restrictions on the use of future cash flows. Firms that show rapid increases in debt are less likely to choose further debt financing and will choose seasoned equity offerings.
Common Equity

Common equity acts as a proxy for retained earnings, as it includes both capital stock and retained earnings. An increase in retained earnings could mean that more funds are being generated by the business and possibly that there is less need for external funding in the form of an SEO. However, it is more realistic to consider an increase in retained earnings as an indicator of greater reinvestment capability and of greater interest by management in promoting the growth and future development of the enterprise. In such cases, more ambitious capital investment projects will be undertaken, possibly those that involve the creation of new products and markets. Such projects may be too risky for traditional financial institutions, limiting the amount of capital that can be raised through debt. In such cases, equity becomes the preferred financing choice. The final category consists of income statement variables. Income statement variables may not be as useful as balance sheet variables in gauging external funding sources as they tend to have a short-term focus on quarterly rather than long-term results. However, the case can be made for the value of examining net income and capital expenditure.
Net Income

This is the single measure of the final profitability of the firm. Firms with rising net incomes are successful firms with products that continue to find customers and managers who are committed to the long-term success of the enterprise. They do not engage in agency-conflicted behaviors that promote individual self-interest at the expense of the firm's prospects; therefore, such firms continue to be profitable year after year, so that any expansions they undertake with external funding are fully justifiable to shareholders and the public. Such firms would not wish to see a decline in
income by raising funds through debt as interest expense would reverse the trend of rising net income. Further, foregoing potentially profitable products would not be a choice as such managers are focused on continually raising profits.
Capital Expenditures

An increase in capital expenditures represents an increase in funding for new equipment, research expenditure, new product development, and new market expansion. In keeping with Walker and Yost (2008), we employ the measure of capital expenditure/total assets, which includes research expenditure. Rising capital expenditures mandate the need for significant capital spending on new projects with uncertain potential. Financial institutions may be reluctant to finance such projects, so that SEOs become the major source of investment capital. The above discussion leads to the following hypotheses. The probability of use of seasoned equity offerings increases with:

H1. The combined effect of a rise in retained earnings and debt,
H2. An increase in capital expenditure,
H3. An increase in dividends,
H4. Long-term debt reduction,
H5. The decrease in current debt,
H6. The decline in levels of cash and short-term investments,
H7. The increase in long-term debt (this hypothesis is the alternative to H4),
H8. The rise in net income.
METHODOLOGY

The entire Compustat North America database of 10,000 stocks was screened to arrive at a sample of stocks with SEO potential. Using total assets as the discriminator, firms that had total assets in the 95th percentile were isolated. As stated, only large, visible firms were considered to be
possible SEO candidates. Four years of annual financial statement data, covering 2002 to 2005, were extracted for each of these firms. This ensured predictive accuracy based upon the effect of normal market conditions without the confounding effect of the economic downturn of 2007–2009. The variables included retained earnings, long-term debt, capital expenditure, dividends, long-term debt reduction, change in current debt, cash and short-term investments, net income, interest expense, and operating income after depreciation. Each variable was scaled by total assets to account for variations in the size of the firm. Asset size was used to estimate the probability of an SEO offering, with a dichotomous variable taking on values of 0 and 1 to indicate whether the firm had a probability of an SEO offering (score of 1) or no probability of an SEO offering (score of 0). The data were subjected to the following logistic regression:

P(SEO offering) = a + b1 RE + b2 LTD + b3 CE + b4 D + b5 DR + b6 CD + b7 C + b8 NI    (1)
where RE is the retained earnings measured by common equity, CE the capital expenditure, LTD the long-term debt, D the dividends, DR the debt reduction, CD the current debt, C the cash and short-term investments, NI the net income, and P(SEO offering) a dichotomous variable based on asset size. Firms with asset sizes greater than the mean were assigned values of "1," while those with asset sizes less than the mean were assigned values of "0."
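A minimal sketch of how Eq. (1) might be estimated is given below, assuming the firm-year data have been assembled in a flat file; the file name, the column names, and the 0.5 classification cutoff are illustrative assumptions rather than details taken from the chapter.

```python
# Hedged sketch of the chapter's logistic regression (Eq. 1).
# File layout and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("compustat_annual.csv")     # assumed: one row per firm-year

predictors = ["RE", "LTD", "CE", "D", "DR", "CD", "C", "NI"]

# Scale each predictor by total assets, as described in the text.
X = df[predictors].div(df["total_assets"], axis=0)
X = sm.add_constant(X)                       # intercept term a

# Dichotomous dependent variable based on asset size (1 = above-mean assets).
y = (df["total_assets"] > df["total_assets"].mean()).astype(int)

logit = sm.Logit(y, X).fit()
print(logit.summary())

# In-sample classification accuracy at a 0.5 cutoff, analogous to the
# "percent accuracy" figure reported with the results.
accuracy = ((logit.predict(X) >= 0.5) == y).mean()
print(f"Percent accuracy: {100 * accuracy:.2f}")
```

The pseudo R-squared reported alongside the fit (available as logit.prsquared in statsmodels) corresponds to the kind of fit statistic reported with Table 1.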
RESULTS

Annual observations over the years 2002–2005 for the 300 separate stocks in the final sample with the highest likelihood of being selected for seasoned equity offerings were subjected to a logistic regression with the probability of selection as the dependent variable and capital expenditure, common equity, cash and short-term investments, debt reduction, current debt, dividends, long-term debt, and net income as independent variables. The final model used 1228 observations with 1031 correct cases, thereby predicting the probability of SEO offerings with 83.96 percent accuracy. As shown by Table 1, Hypothesis 1 was partly supported, with a significant decline in common equity but no significant reduction in debt: the coefficient for the change in common equity was a significant 2.07 × 10⁻⁵ (p < .01), while that for debt reduction was a nonsignificant 1.697 × 10⁻⁵ (p > .1). Hypothesis 2
Table 1. Results of Logistic Regression of the Probability of Selecting Seasoned Equity Offerings on Firm Characteristics.

Variable                              Coefficient        t-Ratio    p-value
Capital expenditure                   2.388 × 10⁻⁴       4.53       .0000
Common equity                         2.079 × 10⁻⁵       2.73       .006
Cash and short-term investments       1.107 × 10⁻⁵       2.69       .007
Debt reduction                        1.697 × 10⁻⁵       1.56       .117
Current debt                          4.612 × 10⁻⁶       1.78       .074
Dividends                             1.495 × 10⁻⁴       1.33       .184
Long-term debt                        1.378 × 10⁻⁴       9.28       .00
Net income                            2.874 × 10⁻⁴       8.29       .00

Percent accuracy: 83.96; Average likelihood: 0.576; Pseudo R²: 0.225.
*p < .05; **p < .01; ***p < .001.
was supported contrary to the hypothesized direction, as the reduction in capital expenditure led to an increased probability of choosing SEOs as a method of financing (coefficient = 2.388 × 10⁻⁴, p < .001). Hypothesis 3 was not supported: firms that pay higher dividends as a percentage of assets are unlikely to seek SEOs as a method of financing (b = 1.495 × 10⁻⁴, p > .1). Hypothesis 4 was not supported; there was no significant reduction in debt (coefficient of 1.697 × 10⁻⁵, p > .1). We may conclude that long-term debt reduction does not significantly influence the selection of firms for seasoned equity offerings. Hypothesis 5 was supported at the .1 level of significance (b = 4.61 × 10⁻⁶, p = .07), though not at the more stringent .05 level. The reduction in current debt marginally increases the likelihood of selection for seasoned equity offerings. Hypothesis 6 was supported contrary to the hypothesized direction: rising levels of cash and short-term investments, or a strong liquidity position, were associated with the likelihood of opting for seasoned equity offerings as the preferred method of financing (coefficient = 1.107 × 10⁻⁵, p < .01). Hypothesis 7 was not supported. As Hypothesis 7 is the alternative to Hypothesis 4, the question is which of the two is supported, a decrease (Hypothesis 4) or an increase (Hypothesis 7) in long-term debt; the answer is neither, as the reduction in debt had no significant effect on the probability of selection for an SEO. The increase in net income was strongly associated with the choice of seasoned equity offerings (coefficient = 2.874 × 10⁻⁴, p < .001), supporting Hypothesis 8 (Table 2).
Table 2. Descriptive Statistics for Firm Characteristics (N = 1228 for each variable).

Capital expenditure: Mean 1461; Variance 824,404.00; 25th percentile 0; Median 444.5; 75th percentile 1608.25; Maximum 33,274; Minimum 0; Skewness 4.98; Kurtosis 37.70.
Common equity: Mean 11,085; Variance 2.4904 × 10⁸; 25th percentile 1911; Median 6445; 75th percentile 12,823.25; Maximum 111,412; Minimum 0; Skewness 3.12; Kurtosis 12.18.
Cash and short-term investments: Mean 9250.59; Variance 9.113 × 10⁸; 25th percentile 207.75; Median 1462.5; 75th percentile 4612.25; Maximum 339,136; Minimum 0; Skewness 6.33; Kurtosis 47.44.
Debt reduction: Mean 4104.33; Variance 3.21 × 10⁸; 25th percentile 0; Median 617; 75th percentile 2248.00; Maximum 280,684; Minimum 0; Skewness 10.24; Kurtosis 120.78.
Current debt: Mean 19,379.97; Variance 2.94 × 10⁹; 25th percentile 1315.25; Median 5626; 75th percentile 12,287.75; Maximum 542,569; Minimum 0; Skewness 6.09; Kurtosis 43.79.
Dividends: Mean 486.60; Variance 1,168,108.98; 25th percentile 0; Median 107; 75th percentile 501; Maximum 9352; Minimum 0; Skewness 4.67; Kurtosis 26.52.
Long-term debt: Mean 4179.23; Variance 75,399,151.77; 25th percentile 0; Median 0; 75th percentile 5619.25; Maximum 105,502; Minimum –24,615; Skewness 5.15; Kurtosis 41.76.
Net income: Mean 1335.12; Variance 9,397,749.18; 25th percentile 0; Median 623; 75th percentile 1743; Maximum 24,521; Minimum –25,780; Skewness 1.13; Kurtosis 19.59.
CONCLUSIONS AND RECOMMENDATIONS FOR FUTURE RESEARCH

The sum of all of the hypotheses indicates that firms that are fundamentally strong are more likely to select seasoned equity offerings as a method of financing. Such firms have rising net incomes, suggesting that they produce profitable products targeted at growing markets, either domestically or internationally. They do not necessarily pay high dividends and do not use a large amount of common equity to fund expansion. As common equity declines, they rely to an increasing extent on retained earnings, or internal financing. They are averse to relying on financial leverage to fund expansion. The attitude toward debt is apparent in that they have declining levels of existing debt, or old debt that is in the process of being retired, without a firm policy of debt reduction whereby they aggressively pay off
existing debt. They maintain high and strengthening cash balances, which reduce exposure in uncertain economic times and provide a cushion of capital in an economic downturn. This suggests a high level of conservatism even prior to the economic downturn of 2007–2009.

Another relevant result is that the decline in capital expenditures is associated with the probability of selection for seasoned equity offerings. At first glance, this may seem puzzling, given the implicit assumption that firms seek more expensive equity funding to finance capital projects. However, we need to be aware that our measure is of current capital expenditure. Perhaps these firms have reached their limit and are facing diminishing returns on current capital investment projects. Given their rising net incomes, it is likely that they wish to fund new, innovative projects with uncertain profit potential, so that they do not wish to raise debt and choose equity as the optimal method of financing.

This study adds to the existing body of literature on the characteristics of firms that undertake seasoned equity offerings. Together with Walker and Yost (2008), it provides the only body of knowledge that examines multiple characteristics to explain the choice of seasoned equity for financing. In addition, it is both contemporary and complete. The data set is current, using data from the post-2002 period, and it provides a longer span of data than Walker and Yost (2008), who employed two years of data versus four in this study. Our examination of the entire Compustat database, with its full list of 10,000 stocks, makes this study uniquely comprehensive.

Future research should consider operating performance as a determinant of seasoned equity offerings. The particularly relevant variable in this case is operating income after depreciation. Firms that invest in research and development by purchasing new equipment are able to write off significant amounts of this new cost as depreciation expense. This depresses their operating income after depreciation. As research and development expenditures continue to rise, it is likely that there will come a point at which a rapidly expanding firm will be unable to meet its research and development expenditures from internal funds. Debt would reduce the level of internal funding, so that equity financing in the form of SEOs offers the more attractive alternative. Another method of confirming the growing trend toward foregoing debt as a means of financing would be to use interest expense as a determinant of SEO offerings. The arguments against the use of debt apply to rising interest expense. Interest expense places a burden on operations and pressure on managers to generate sufficient income from operations to meet fixed payments. Declining interest expense is an indicator of declining dependence on debt and an increasing probability of relying upon new equity.
In summary, this chapter adds to the literature on capital structure by focusing on an area in which there is a paucity of research, i.e., the firm characteristics that underlie the selection of stocks for seasoned equity offerings, and by offering a comprehensive approach to forecasting the prevalence of such offerings.
REFERENCES

Ambarish, R., John, K., & Williams, J. (1987). Efficient signaling with dividends and investments. Journal of Finance, 42, 321–344.
Fama, E., & French, K. (2002). Testing trade-off and pecking order predictions about dividends and debt. Review of Financial Studies, 15, 1–33.
Hull, R. (1999). Leverage ratios, industry norm, and stock price reaction: An empirical investigation. Financial Management, 28, 32–45.
Hull, R., & Moellenberndt, R. (1994). Bank debt reduction announcements and negative signaling. Financial Management, 23, 21–30.
John, K., Koticha, A., Narayanan, R., & Subrahmanyam, M. (2000). Margin rules, informed trading in derivatives, and price dynamics. Working Paper, New York University.
Johnson, D., Serrano, J., & Thompson, G. (1996). Seasoned equity offerings for new investments. The Journal of Financial Research, 19, 91–103.
Masulis, R., & Korwar, A. (1986). Seasoned equity offerings: An empirical investigation. Journal of Financial Economics, 15, 31–60.
Mikkelson, W., & Partch, M. (1986). Valuation effects of security offerings and the issuance process. Journal of Financial Economics, 15, 31–60.
Myers, S., & Majluf, N. (1984). Corporate financing and investment decisions when firms have information that investors do not have. Journal of Financial Economics, 13, 187–221.
Stigler, G. J. (1960). The economics of information. Journal of Political Economy, 69, 213–225.
Walker, M. D., & Yost, K. (2008). Seasoned equity offerings: What firms say, do, and how the market reacts. Journal of Corporate Finance, 14, 376–386.
THE IMPACT OF LIFE CYCLE ON THE VALUE RELEVANCE OF FINANCIAL PERFORMANCE MEASURES

Shaw K. Chen, Yu-Lin Chang and Chung-Jen Fu

ABSTRACT

The components of earnings or cash flows have different implications for the assessment of the firm's value. We extend the research on value-relevant fundamentals to examine which financial performance measures convey more information to help investors evaluate the performance and value of firms in different life cycle stages in the high-tech industry. Six financial performance measures are utilized to explain the difference between market value and book value. Cross-sectional data from firms in the Taiwanese information electronics industry are used. We find that all six performance measures, which are taken from the income statement and the cash flow statement, are important value indicators, but the relative degrees of value relevance of the various performance measures differ across the firm's life cycle stages. The empirical results support the view that capital markets react to the various financial performance measures differently across life cycle stages and that this is reflected in the stock price.
INTRODUCTION

Any financial variable is considered to be value relevant if it has a predictable association with market values of equity. A sizable literature suggests that financial measures can provide value-relevant information for investors and that the components of earnings or cash flows have different implications for the assessment of firm value. On the other hand, different life cycle stages constitute an important contingency factor in the development of the organizational theory of companies (Koberg, Uhlenbruck, & Sarason, 1996). Firms in different life cycle stages have different economic characteristics that may affect the usefulness and value relevance of financial performance measures (Black, 1998; Baginski, Lorek, & Branson, 1999). Thus, it is important to consider the impact of the life cycle stage on the value relevance of the components of earnings and cash flows. Impacts of different life cycle stages are critical for evaluating performance (Hofer, 1975; Robinson, 1998; Pashley & Philippatos, 1990; Robinson & McDougall, 2001). Jorion and Talmor (2001) state that although generally accepted accounting principles (GAAP) are designed for all companies to provide comparable financial reports, the usefulness of accounting information may vary according to the changes in the production function and activities of firms. Porter (1980) suggests that regardless of a firm's strategies (e.g., sales growth or utilization of capital capacity), the impacts of the life cycle stage of the company should be considered. Robinson (1998) shows that the financial performance of companies that enter a market in the start-up stage is better than that of companies that enter a market in the mature stage. His results support the proposition that it is very important to consider the impact of the life cycle when evaluating the financial performance measures of firms. Fairfield, Sweeney, and Yohn (1996) suggest that reported earnings alone may not transmit all the information in accounting data for evaluating firm profitability. Black (1998) further analyzes the life cycle impacts on the incremental value relevance of earnings and cash flow measures.

Because the components of earnings or cash flows have different implications for the assessment of firm value, this study extends the research on value-relevant fundamentals by examining which financial performance measures convey more information when evaluating firm value. We use cross-sectional data over twelve years, yielding a total of 4,862 firm-year observations, to examine six financial performance measures for firms in different life cycle stages in the Taiwanese high-tech industry. The earnings measures are decomposed into research and development expense (R&D), operating
income (OI), and adjusted nonoperating income (ANOI). The cash flow measures are decomposed into cash flows from operating (CFO), cash flows from investing (CFI), and cash flows from financing (CFF). We find that all six performance measures are important value indicators, but the relative degrees of value relevance differ across firm life cycle stages. For earnings-related measures, R&D and ANOI are the most value-relevant indicators in the growth stage, whereas OI is more value-relevant in the mature stage than in the decline stage. For cash flow measures, CFI and CFF are both better value indicators in the decline stage than in the growth stage. This implies that the market does attend to the various financial performance measures differently across life cycle stages and reflects this in the stock price. Our result supports the Financial Accounting Standards Board (FASB) suggestion that present and potential investors can rely on accounting information to improve investment, credit, and similar predictive decisions. Specifically, our findings detail the value relevance of different financial performance measures across the life cycle stages of information electronics firms, a critical and highly competitive industry for Taiwan's economy. The remainder of this chapter is organized as follows. The second section presents the literature review and derives hypotheses from past research, and the third section describes our sample selection procedure, life cycle classification, the definition and measurement of the variables, and the valuation model. The empirical results and analyses are discussed in the fourth section. Further discussion of our findings and a brief conclusion are provided in the final section.
LITERATURE REVIEW AND HYPOTHESES

Financial performance measures have been most widely used to evaluate the organizational health of a firm. However, the opportunities, pressures, and threats in both the external and internal environment of an organization vary with the stages of the life cycle (Anderson & Zeithaml, 1984; Jawahar & Mclaughlin, 2001). Myers (1977) asserts that firm value can be analyzed from two components: assets in place and future growth opportunities. In early life cycle stages, growth opportunities constitute a larger component of firm value; but in the later life cycle stages, assets in place become the larger component. The information conveyed by the financial performance measures is expected to be different for each component. Since the proportion of these two value components differs in each life cycle stage,
the value relevance of financial performance measures is expected to vary by stage (Anthony & Ramesh, 1992; Black, 1998; Hand, 2005). Due to the change in the combinations of activities and investments, there is a shift in the value relevance of certain financial statement line items (Jorion & Talmor, 2001). Penman (2001) also indicates that the information transmitted by various cash flows typically varies over the life cycle of the firm. We expect the information conveyed by the financial performance measures to be positively associated with firm value in different life cycle stages, and we develop the related hypotheses for each stage below.
The Value Relevance of Financial Performance Measures in the Start-Up Stage

Prior literature states that the start-up stage is characterized by market growth, less intense competition, heavy capital investment requirements, new technology development, and high prices. Dodge, Fullerton, and Robbins (1994) and Jawahar and Mclaughlin (2001) indicate that financing and marketing problems are perceived as crucial for organizational survival. Thus, a firm's important concerns in this stage are obtaining the initial capital to solve financing problems and entering the market. Anderson and Zeithaml (1984) indicate that during this stage, primary demand for the product begins to grow, and products are unfamiliar to potential customers. Robinson (1998) and Robinson and McDougall (2001) suggest that sales growth is more appropriate than market share as a performance measure in the start-up stage. Therefore, sales growth is necessary for firms to survive. As sales growth increases, the value of the firm will increase in the start-up stage.1
The Value Relevance of Financial Performance Measures in the Growth Stage

R&D is a critical element in the production function of information electronics firms. Booth (1998) and Hand (2003, 2005) find that a firm with continuous research and development will have a higher ratio of hidden value in its firm value, supporting the proposition that R&D is beneficial to the future development of the firm. With regard to research-intensive firms, Joos (2002) finds that when R&D increases, return on equity (ROE) will also increase. Relative to the later two life cycle stages, firms in the
growth stage will spend more on R&D, leading to future profit-generating opportunities and increased firm value. This discussion leads to the following hypothesis:

H1a. Relative to the mature and decline stages, R&D is more positively associated with firm value for firms in the growth stage.

In the growth stage, most firms cannot generate much income from core operating activities. Those that can obtain more income from nonoperating activities will create more value for shareholders. Fairfield et al. (1996) suggest that nonoperating income has incremental predictive content for future profitability. Therefore, in this stage, the firm's adjusted nonoperating income (ANOI) is positively associated with firm value, and we propose the following hypothesis:

H1b. Relative to the mature and decline stages, ANOI is more positively associated with firm value for firms in the growth stage.
The Value Relevance of Financial Performance Measures in the Mature Stage

Robinson (1998) indicates that operating income (OI) is a superior measure for reflecting a firm's ability to sell its products and for evaluating operational performance. Relative to the growth and decline stages, OI is a more important earnings item, for a firm in the mature stage alters its focus from increasing sales growth to increasing OI. As OI increases, the value of the firm will increase. Thus we test the following hypothesis:

H2a. Relative to the growth and decline stages, OI is more positively associated with firm value for firms in the mature stage.

Klein and Marquardt (2006) view CFO as a financial measure of the firm's real performance. In the mature stage, firms are often characterized by generating sufficient cash flows without particularly attractive investment opportunities (Jawahar & Mclaughlin, 2001; Penman, 2001). Black (1998) also finds that CFO is positively associated with the market value of the firm in the mature stage. Therefore, as CFO increases, the value of the firm will increase, which leads to our next hypothesis:

H2b. Relative to the growth and decline stages, CFO is more positively associated with firm value for firms in the mature stage.
The Value Relevance of Financial Performance Measures in the Decline Stage

Business risk is high when firms move into the decline stage. When demand for an organization's traditional products and services is reduced, less efficient firms are forced out of industries and markets (Konrath, 1990; Pashley & Philippatos, 1990). Following the BCG (Boston Consulting Group) model, when companies step into the decline stage, the ability to generate cash becomes a more important value driver for those companies or divisions. The ability to generate funds from outside affects the opportunity for firms' continuous operations and improvement. Therefore, as CFF increases, the value of the firm will increase, which leads to our next hypothesis:

H3a. Relative to the growth and mature stages, CFF is more positively associated with firm value for firms in the decline stage.

Declining firms do not necessarily fail.2 Some firms are forced to consider investment and to develop new products and technology to ensure organizational survival (Kazanjian, 1988; Jawahar & Mclaughlin, 2001; Zoltners, Sinha, & Lorimer, 2006). Black (1998) points out that firms can regenerate by investing in new production facilities and innovative technology and move back into the growth or mature stage, preventing failure for many years. CFI is expected to be positively correlated with firm value during decline (Black, 1998). Pashley and Philippatos (1990) reach a similar conclusion. Therefore, we develop the following hypothesis:

H3b. Relative to the growth and mature stages, CFI is more positively associated with firm value for firms in the decline stage.
RESEARCH DESIGN AND METHODOLOGY

Sample Selection

We selected publicly listed information electronics companies from the Taiwan Stock Exchange (TSE) and the Gre Tai Securities Market (Taiwan OTC market). The companies' financial data and the equity market value data are obtained from the Financial Data of Company Profile of the Taiwan Economic Journal (TEJ) Data Bank. A total of 4,862 firm-year observations over twelve years from 1997 to 2008 are collected.
The criteria for sample selection are: (1) sample firms are limited to the information electronics industries; (2) companies with any missing stock price or financial data are excluded; and (3) companies subject to full-delivery settlements and de-listed companies are excluded. We focus our attention on the information electronics industry for two reasons: (1) we hope to derive less noisy competitive structure variables (such as barriers to entry, concentration, and market share) (Joos, 2002) to mitigate some problems caused by using cross-sectional studies (Ittner, Larcker, & Randall, 2003), and (2) the information electronics industry is a strategically critical sector for Taiwan's economic prosperity and growth.

Life Cycle Classification

Classifying companies into different life cycle stages is a challenging task. This study uses a tailored classification method similar to Anthony and Ramesh (1992) and Black (1998). A multivariate classification method is used to classify observations into three life cycle stages. The procedure of life cycle classification is as follows. First, we choose sales growth, capital expenditures, dividend payout, and firm age as the classification indicators. Second, sales growth and capital expenditures are ranked from highest to lowest, while dividend payout and firm age are ranked from lowest to highest. Each indicator is given a score of 0, 1, or 2 based on its ranking: a firm with the highest sales growth or capital expenditures is given a score of 0, whereas a firm with the highest dividend payout or firm age is given a score of 2. The scores of the four classification indicators are then summed, giving a composite score that ranges from zero to eight. Finally, firm-years are assigned to one of the three groups based on the composite score: firm-year observations with a composite score of 2 or less are assigned to the growth stage, firm-years with a composite score of three, four, or five are assigned to the mature stage, and firm-year observations with a composite score of 6 or more are assigned to the decline stage. Our final sample retains 1,070 firm-year observations in the growth stage, 2,730 in the mature stage, and 1,062 in the decline stage (a coded sketch of this scoring appears after the empirical model below).

Empirical Model

We extend Ohlson's (1995) valuation model to examine the relationship between equity market value and the various financial performance
measures in each life cycle stage. We add a dummy variable (STAGE) based on the life cycle stage and build two-way interaction terms among the various financial variables to test our hypotheses. The definition and measurement of the variables are given in Table 1. The extended empirical model is as follows:

MVi = b0 + b1 BVCi + b2 CFOi + b3 CFOi×STAGEi + b4 CFIi + b5 CFIi×STAGEi + b6 CFFi + b7 CFFi×STAGEi + b8 R&Di + b9 R&Di×STAGEi + b10 OIi + b11 OIi×STAGEi + b12 ANOIi + b13 ANOIi×STAGEi + εi

where MV is the market value of equity, BVC the book value of net assets except for cash, CFO the cash flows from operating, CFI the cash flows from investing, CFF the cash flows from financing, R&D the R&D expense, OI the operating income, ANOI the adjusted nonoperating income, CFO×STAGE the interaction term for cash flows from operating and STAGE, CFI×STAGE the interaction term for cash flows from investing and STAGE, CFF×STAGE the interaction term for cash flows from financing and STAGE, R&D×STAGE the interaction term for R&D expense and STAGE, OI×STAGE the interaction term for operating income and STAGE, ANOI×STAGE the interaction term for adjusted nonoperating income and STAGE, and STAGE the life cycle stage.
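The classification scoring and the interaction design above can be sketched in code. In the sketch below, the tercile cut used to award the 0/1/2 indicator scores, the file name, and all column names are illustrative assumptions rather than details taken from the chapter; only the stage thresholds (composite score of 2 or less for growth, 3–5 for mature, 6 or more for decline) follow the text.

```python
# Hedged sketch of the life cycle scoring and the extended valuation regression.
# Stage thresholds follow the text; the tercile cut per indicator and the
# column names are assumptions.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("tej_firm_years.csv")   # assumed: one row per firm-year, deflated by AV(t-1)

def indicator_score(series, descending):
    # 0/1/2 by tercile rank: high sales growth / capital expenditures -> 0,
    # high dividend payout / firm age -> 2.
    labels = [2, 1, 0] if descending else [0, 1, 2]
    return pd.qcut(series.rank(method="first"), 3, labels=labels).astype(int)

score = (indicator_score(df["sales_growth"], descending=True)
         + indicator_score(df["capital_expenditures"], descending=True)
         + indicator_score(df["dividend_payout"], descending=False)
         + indicator_score(df["firm_age"], descending=False))

df["stage"] = pd.cut(score, bins=[-1, 2, 5, 8], labels=["growth", "mature", "decline"])

# Panel A style comparison: growth (STAGE = 1) versus mature (STAGE = 0).
sub = df[df["stage"].isin(["growth", "mature"])].copy()
sub["STAGE"] = (sub["stage"] == "growth").astype(int)

measures = ["CFO", "CFI", "CFF", "RD", "OI", "ANOI"]
X = sub[["BVC"] + measures].copy()
for m in measures:                       # two-way interaction terms
    X[f"{m}xSTAGE"] = sub[m] * sub["STAGE"]

ols = sm.OLS(sub["MV"], sm.add_constant(X)).fit()
print(ols.summary())
```

Under this setup, a hypothesis such as H1a reduces to inspecting the sign and significance of the fitted R&D×STAGE interaction coefficient.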
ANALYSIS OF EMPIRICAL RESULTS

Descriptive Statistics and Correlation Analysis

Table 2 provides the descriptive information on the variables in the different life cycle stages. As shown in Table 2, the mean of MV is 22,280,000, 17,983,380, and 11,049,000 thousand New Taiwan Dollars (NTD) for the growth, mature, and decline stages, respectively. The mean of R&D decreases from 309,501 to 256,245 thousand NTD, which implies that firms generally reduce research and development expenditures as they decline over time. The mean of CFI (cash outflows) decreases from 2,850,576 to 528,939 thousand NTD, consistent with firms' investment opportunity sets becoming smaller as firms decline. CFF provides information about the ability of a firm's assets in place to generate cash to pay off existing debt or to acquire additional funds for the firm (Pashley & Philippatos, 1990). The mean of CFF also decreases from
Table 1. Definition and Measurement of Variables.

A. Life cycle classification indicator variables
Sales growth (SGit): 100 × (salest − salest−1)/salest−1
Dividend payout (DPit): 100 × annual dividend of common stock/annual income
Capital expenditures (CEit): 100 × (purchase of fixed assets − reevaluated fixed assets of firm i at time t)/AVt−1
Firm age: The difference between the current year and the year in which the firm was originally formed
Life cycle stage (STAGEit): Dummy variable for the three group comparisons (growth stage compared with mature stage; mature stage compared with decline stage; and decline stage compared with growth stage), taking the value of 1 for the former life cycle stage and 0 for the latter life cycle stage

B. Variables of the empirical model
Market value of equity (MVit): The market value of equity of firm i at time t/AVt−1
Cash flows from operating (CFOit): Cash flows from operating activities of firm i at time t/AVt−1
Cash flows from investing (CFIit): Cash flows from investing activities of firm i at time t/AVt−1
Cash flows from financing (CFFit): Cash flows from financing activities of firm i at time t/AVt−1
R&D expense (R&Dit): R&D expense of firm i at time t/AVt−1
Operating income (OIit): (Gross profit − operating expenses) of firm i at time t/AVt−1
Net income (NIit): The net income of firm i at time t/AVt−1
Adjusted nonoperating income (ANOIit): (NIit − OIit − R&Dit)/AVt−1

Control variable
Book value of net assets except for cash (BVCit): The book value of equity less the change in the cash account of firm i at time t/AVt−1 (Black, 1998)

Notes: All of the variables are deflated each year by the book value of assets at the end of year t−1 (AVt−1). All of the financial variables are measured in thousand dollars.
1,439,503 to 652,183 thousand NTD. This is also consistent with our inference that firms require more external funds during the decline stage. Table 3 shows the interrelations among the various financial performance measures and MV in the different life cycle stages. We find that the correlations among the variables are significant. For example, in the growth stage, MV is highly positively correlated with R&D (Pearson r = 0.361; p < 0.01); in the mature stage, MV is highly positively correlated with CFO (Pearson r = 0.338; p < 0.01); and in the decline stage, MV is highly positively correlated with
7,207,595 733,486 23,979 425,778,584 28,350,024
Panel C: Decline stage Mean 11,049,000 Median 1,864,000 Minimum 49,000 Maximum 1,545,626,000 Standard deviation 65,756,911
1,508,180 157,816 7,668,056 196,080,297 9,330,262
1,782,532 145,541 4,353,322 96,485,947 7,867,416
CFO
528,939 57,540 73,057,733 12,254,894 3,631,809
1,060,463 115,808 150,875,235 17,399,958 6,138,540
2,850,576 337,595 127,287,435 1,522,523 10,336,019
CFI
652,183 97,635 115,013,456 22,026,509 5,181,927
297,089 20,708 135,893,492 38,581,045 4,919,306
1,439,503 211,757 24,101,046 53,682,439 5,907,868
CFF
Summary Descriptive Statistics.
256,245 42,393 0 19,737,038 1,083,259
298,826 55,618 0 15,913,834 979,253
309,501 72,022 0 11,725,035 879,612
R&D
425,678 100,813 39,542,273 106,290,232 5,035,696
914,317 128,346 18,070,807 126,299,859 5,546,756
1,043,012 204,912 11,038,543 60,541,105 4,013,518
OI
217,053 48,549 32,823,761 17,257,559 2,092,507
247,097 82,723 26,612,979 35,777,041 18,930,082
334,991 86,492 24,291,050 4,826,220 1,454,770
ANOI
Notes: The definition and measurement of variables are given in Table 1 (the variables of this table are not deflated by AV t1). All of the variables are measured in thousand dollars.
8,247,583 1,596,120 514,809,009 398,965,299 33,200,797
Panel B: Mature stage Mean 17,983,380 Media 2,460,000 Minimum 65,000 Maximum 1,743,503,000 Standard deviation 88,373,059
BVC
8,083,377 1,717,529 22,331 267,600,118 23,349,166
MV
Panel A: Growth stage Mean 22,280,000 Media 4,003,000 Minimum 92,000 Maximum 1,281,037,000 Standard deviation 74,759,000
Variables
Table 2.
0.496 1 0.241 0.364 0.047 0.216 0.469 0.119
0.498 1 0.141 0.275 0.044 0.219 0.333 0.153
Panel B: Mature stage MV 1 BVC 0.618 CFO 0.360 CFI 0.286 CFF 0.004 R&D 0.280 OI 0.675 ANOI 0.089
Panel C: Decline stage MV 1 BVC 0.575 CFO 0.294 CFI 0.229 CFF 0.045 R&D 0.219 OI 0.569 ANOI 0.168 0.293 0.148 1 0.301 0.481 0.178 0.549 0.059
0.338 0.171 1 0.292 0.437 0.164 0.503 0.054
0.379 0.143 1 0.323 0.303 0.154 0.537 0.093
CFO
0.233 0.360 0.261 1 0.357 0.077 0.238 0.085
0.222 0.313 0.203 1 0.414 0.010 0.264 0.063
0.312 0.546 0.231 1 0.533 0.002 0.282 0.061
CFI
0.143 0.061 0.365 0.492 1 0.089 0.244 0.006
0.086 0.039 0.436 0.590 1 0.109 0.108 0.107
0.186 0.364 0.252 0.699 1 0.066 0.008 0.054
CFF
0.200 0.200 0.160 0.062 0.058 1 0.206 0.424
0.246 0.155 0.185 0.006 0.075 1 0.174 0.516
0.361 0.201 0.215 0.045 0.015 1 0.275 0.609
R&D
OI
0.486 0.374 0.586 0.208 0.211 0.202 1 0.125
0.640 0.412 0.496 0.190 0.072 0.171 1 0.089
0.640 0.431 0.510 0.270 0.073 0.286 1 0.289
*p < 0.10; **p < 0.05; ***p < 0.01. Note: The definitions of variables are given in Table 1. Upper triangle: Pearson correlations; lower triangle: Spearman correlations.
0.537 1 0.242 0.499 0.118 0.247 0.500 0.030
BVC
Correlation Coefficients among Variables in Different Life Cycle Stages.
Panel A: Growth stage MV 1 BVC 0.638 CFO 0.430 CFI 0.297 CFF 0.068 R&D 0.374 OI 0.702 ANOI 0.133
MV
Table 3.
0.217 0.223 0.072 0.047 0.073 0.350 0.125 1
0.079 0.247 0.010 0.038 0.042 0.538 0.053 1
0.116 0.045 0.108 0.002 0.027 0.667 0.289 1
ANOI
CFF (Pearson r = 0.143; p < 0.01). In general, the observed relations among the variables are consistent with our expectations. To further test for the existence of multicollinearity, we utilize the variance inflation factor (VIF).
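A minimal sketch of such a VIF check is shown below; the data file and column names are the same hypothetical ones used in the earlier sketches, and the rule-of-thumb cutoff of 10 mentioned in the comment is a common convention rather than a threshold stated in the chapter.

```python
# Hedged sketch of a variance inflation factor (VIF) check on the regressors.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("tej_firm_years.csv")           # assumed firm-year panel
cols = ["BVC", "CFO", "CFI", "CFF", "RD", "OI", "ANOI"]

exog = sm.add_constant(df[cols].dropna()).to_numpy()

# VIF for each regressor (index 0 is the constant, so start at 1).
vif = pd.Series(
    [variance_inflation_factor(exog, i) for i in range(1, exog.shape[1])],
    index=cols,
)
print(vif.round(2))                              # values above ~10 would flag collinearity
```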
The Impact of Life Cycle

In financial accounting research, financial statements are said to be value relevant if they are associated with equity prices, values, or returns (Hand, 2005). We examine the incremental effect of the different life cycle stages on the financial performance measures and show the pairwise comparisons between life cycle stages in Table 4. The empirical results show that the estimated coefficient of the interaction term R&D×STAGE is significantly positive (Panel A, t = 8.645, p < 0.01; Panel C, t = 6.901, p < 0.01). This result supports our inference that R&D carries more weight in the growth stage than in the other two life cycle stages. It also implies that R&D investments, especially for firms in the growth stage, are critical for firm valuation. Relative to the later two life cycle stages, firms in the growth stage that seek to increase the competitive advantage of the company and its new products will spend heavily on R&D, leading to future profit-generating opportunities and increased firm value. The estimated coefficient of the interaction term ANOI×STAGE is also significantly positive (Panel A, t = 7.677, p < 0.01; Panel C, t = –6.650, p < 0.01). These results indicate that the market places a higher positive valuation on firms that have higher R&D and ANOI in the growth stage. H1a and H1b are supported. The estimated coefficient of OI×STAGE in the mature stage is more significant relative to the decline stage (Panel B, t = 4.806, p < 0.01). This result reflects the fact that Taiwanese information electronics firms in the mature stage with the ability to generate higher operating income receive a more positive valuation than firms in the decline stage. However, the estimated coefficient of OI×STAGE in Panel A is positive, which means that OI in the growth stage is more significant relative to the mature stage. The results therefore only partially support H2a. On the other hand, the estimated coefficient of CFO×STAGE in the mature stage is higher than in the other two stages but is not significant (Panel A, t = 1.206, p > 0.1; Panel B, t = 0.440, p > 0.1). The result does not support H2b. Comparing CFF×STAGE among the three life cycle stages, we find the estimated coefficient of CFF×STAGE in the decline stage is more value relevant than in the growth stage (Panel C, t = 2.626, p < 0.01), but not more value relevant than in the mature stage (Panel B, t = 0.277,
CFF
CFISTAGE
CFI
CFOSTAGE
CFO
BVC
Intercept
Variable
The Empirical Results of Incremental Effect of Variables between Stages.
Coefficient (t-Statistic)
Panel A: The Growth stage (STAGE ¼ 1) compared with the Mature stage (STAGE ¼ 0) ? 0.159 (2.682) þ 1.472 (16.058) þ 1.998 (7.448) – 0.443 (1.206) 7 1.887 (6.753) ? 1.031 (2.933) þ 2.581 (10.571)
Expected Sign
Coefficient (t-Statistic)
Panel B: The Mature stage (STAGE ¼ 1) compared with the Decline stage (STAGE ¼ 0) ? 0.076 (1.581) þ 1.262 (16.820) þ 1.676 (4.341) þ 0.188 (0.440) 7 1.575 (4.139) – 0.024 (0.056) þ 2.538 (7.916)
Expected Sign
Coefficient (t-Statistic)
Panel C: The Decline stage (STAGE ¼ 1) compared with the Growth stage (STAGE ¼ 0) ? 0.245 (2.809) þ 1.618 (12.108) þ 1.619 (5.876) ? 0.149 (0.243) þ 0.960 (3.548) þ 0.995 (1.715) þ 1.389 (6.765)
Expected Sign
Table 4.
MV = b0 + b1 BVC + b2 CFO + b3 CFO×STAGE + b4 CFI + b5 CFI×STAGE + b6 CFF + b7 CFF×STAGE + b8 R&D + b9 R&D×STAGE + b10 OI + b11 OI×STAGE + b12 ANOI + b13 ANOI×STAGE
þ
þ
þ
–
þ
þ
R&D
R&DSTAGE
OI
OISTAGE
ANOI
ANOISTAGE
0.590 422.017 3,800
1.210 (3.888) 5.474 (8.282) 8.835 (8.645) 5.259 (22.111) 2.482 (6.937) 1.972 (5.995) 4.915 (7.677)
Coefficient (t-Statistic)
?
þ
þ
þ
?
þ
–
Expected Sign 0.101 (0.277) 4.071 (4.078) 2.177 (2.077) 3.498 (8.980) 2.003 (4.806) 2.069 (6.423) 0.163 (0.419)
Coefficient (t-Statistic)
0.536 337.498 3,792
(Continued )
–
þ
?
þ
–
þ
þ
Expected Sign
0.628 278.152 2,132
1.299 (2.626) 14.011 (15.055) 11.206 (6.901) 7.609 (23.503) 4.434 (7.177) 6.676 (10.809) 4.922 (6.650)
Coefficient (t-Statistic)
Notes: CFO×STAGE, the interaction term of CFO and STAGE; CFI×STAGE, the interaction term of CFI and STAGE; CFF×STAGE, the interaction term of CFF and STAGE; R&D×STAGE, the interaction term of R&D and STAGE; OI×STAGE, the interaction term of OI and STAGE; ANOI×STAGE, the interaction term of ANOI and STAGE; STAGE, life cycle stage.
*p < 0.10; **p < 0.05; ***p < 0.01.
Adjusted R2 F-value Number of observables
?
Expected Sign
CFFSTAGE
Variable
Table 4.
p > 0.10). The results only partially support H3a. The estimated coefficient of CFI×STAGE is significantly more positive in the decline stage than in the growth stage (Panel C, t = 1.715, p < 0.10). The results also only partially support H3b. In sum, we find that the impact of the life cycle exists and helps to explain the association between the various performance measures and firm value among Taiwanese firms.
Additional Analysis

Re-Classify Life Cycle Stage

Although Anthony and Ramesh (1992) and Black (1998) suggest that dividend payout is a reasonable proxy for the life cycle stage, Black (1998) indicates that a low dividend payout may be related to firms in distress, particularly when cash is necessary for other purposes. Some researchers also suggest that dividend payout is not appropriate as a classification indicator in Taiwan. To reexamine the hypotheses, we reclassify the sample into three life cycle stages without dividend payout as a classification indicator.3 Most of the results remain unchanged, as presented in Table 5.

Sub-Industry

Firms in different sub-industries may have different core products, operating strategies, or business environments, and the usefulness of accounting information may vary according to the changes in the production function and activities of firms. The Taiwan Stock Exchange (TSE) has classified firms into eight different sub-industries since 2007. We follow the categories of the TSE and refer to the classification of the TEJ to group observations into eight sub-industries4 and reexamine our hypotheses. From the empirical results in Table 6, comparing the growth stage with the other two life cycle stages, H1a and H1b are supported in six sub-industries (semiconductor (sub1), photodiode (sub3), telecommunication internet (sub4), electronic components (sub5), information service (sub7), and other electronic industry (sub8)). Comparing the mature stage with the decline stage, H2a is only supported in computer and peripheral devices (sub2), electronic components (sub5), electronic channel (sub6), and information service (sub7). Comparing the mature stage with the growth stage, H2b is only supported in photodiode (sub3). In the decline stage, H3a and H3b are only supported in sub5 (electronic components). In sum, we find that the differences among sub-industries have a divergent impact on the value relevance of these
The Empirical Results of Incremental Effect of Variables between Stages – Without Dividend Payout Rate as Life Cycle Classification Indicator.
CFF
CFISTAGE
CFI
CFOSTAGE
CFO
BVC
Intercept
Variable
Coefficient (t-Statistic)
Panel A: The Growth stage (STAGE ¼ 1) compared with the Mature stage (STAGE ¼ 0) ? 0.262 (4.446) þ 1.476 (16.186) þ 1.964 (7.314) – 0.700 (1.902) 7 1.480 (5.341) ? 0.657 (1.873) þ 2.726 (11.444)
Expected Sign
Coefficient (t-Statistic)
Panel B: The Mature stage (STAGE ¼ 1) compared with the Decline stage (STAGE ¼ 0) ? 0.068 (1.437) þ 1.221 (16.334) þ 1.890 (4.807) þ 0.140 (0.748) 7 1.477 (3.641) – 0.180 (0.405) þ 2.282 (6.769)
Expected Sign
Coefficient (t-Statistic)
Panel C: The Decline stage (STAGE ¼ 1) compared with the Growth stage (STAGE ¼ 0) ? 0.432 (4.589) þ 1.723 (11.931) þ 1.380 (4.862) ? 0.957 (1.532) þ 0.975 (3.511) þ 0.858 (1.390) þ 1.470 (6.993)
Expected Sign
MV = b0 + b1 BVC + b2 CFO + b3 CFO×STAGE + b4 CFI + b5 CFI×STAGE + b6 CFF + b7 CFF×STAGE + b8 R&D + b9 R&D×STAGE + b10 OI + b11 OI×STAGE + b12 ANOI + b13 ANOI×STAGE
Table 5.
þ
þ
þ
–
þ
þ
R&D
R&DSTAGE
OI
OISTAGE
ANOI
ANOISTAGE
0.577 409.580 3,900
1.289 (4.197) 6.945 (10.676) 7.291 (6.859) 5.303 (21.473) 2.728 (7.368) 2.243 (6.844) 6.178 (8.797)
*p < 0.10; **p < 0.05; ***p < 0.01. Note: The definitions of variables are given in Table 1.
Adjusted R2 F-value Number of observables
?
CFFSTAGE
?
þ
þ
þ
?
þ
–
0.508 309.458 3,884
0.296 (0.779) 4.400 (4.278) 2.776 (2.603) 2.173 (5.338) 3.300 (7.547) 1.646 (5.161) 0.986 (2.533) –
þ
?
þ
–
þ
þ
0.643 269.894 1,940
0.925 (1.785) 13.879 (13.708) 9.342 (5.435) 7.845 (22.051) 6.249 (9.508) 8.086 (11.430) 6.882 (8.457)
Sub4
Sub5
Sub6
Sub7
(STAGE ¼ 1) compared with the Decline stage (STAGE ¼ 0) 2.307 1.470 0.524 0.981 0.528 2.100 0.705 0.175 1.912 0.858 0.297 1.440 0.283 1.314 1.740 1.306 6.564 3.104 0.796 1.359 0.011 0.369 3.426 2.451 6.749 1.740 1.275 1.497 0.807 0.624 0.592 0.469 0.560 0.580 0.500 580 319 1229 330 219
Sub3
Panel B: The coefficient of variables in the Mature stage 0.199 1.672 CFOSTAGE CFISTAGE 0.615 1.660 CFFSTAGE 0.487 1.485 1.009 4.805 R&DSTAGE OISTAGE 1.560 3.937 ANOI 0.527 5.196 Adjusted R2 0.562 0.622 Number of observables 526 229
Sub2
(STAGE ¼ 1) compared with the Mature stage (STAGE ¼ 0) 1.594 0.164 0.230 2.620 1.564 1.944 0.631 1.620 4.222 2.715 2.407 1.981 2.054 3.855 1.999 7.958 6.172 8.473 4.489 9.569 4.288 0.702 2.603 2.905 0.868 9.329 1.927 2.041 1.454 1.458 0.680 0.501 0.586 0.568 0.598 668 349 1109 299 222
Sub1
Panel A: The coefficient of variables in the Growth stage CFOSTAGE 0.616 2.703 CFISTAGE 0.513 4.645 1.343 2.338 CFFSTAGE R&DSTAGE 10.759 5.250 OISTAGE 1.991 1.668 ANOISTAGE 6.066 3.451 Adjusted R2 0.612 0.579 Number of observables 674 201
Variables
+ b8 R&D + b9 R&D×STAGE + b10 OI + b11 OI×STAGE + b12 ANOI + b13 ANOI×STAGE
0.885 1.065 0.782 4.786 0.924 3.827 0.571 360
1.113 0.309 2.491 9.783 0.354 10.550 0.672 279
Sub8
Table 6. The Partial Empirical Results of Sub-Industries between Different Stages' Comparisons. MV = b0 + b1 BVC + b2 CFO + b3 CFO×STAGE + b4 CFI + b5 CFI×STAGE + b6 CFF + b7 CFF×STAGE
(STAGE ¼ 1) compared with the Growth stage (STAGE ¼ 0) 0.483 0.989 0.833 2.040 1.270 0.390 2.007 1.788 2.709 2.131 1.781 1.472 2.255 2.978 0.747 8.927 16.439 11.077 10.803 8.903 4.409 0.652 6.077 5.383 7.280 7.349 3.849 3.590 2.825 1.840 0.713 0.569 0.601 0.596 0.645 350 170 668 151 107 1.968 0.537 1.962 15.884 0.878 14.085 0.735 201
Notes: The eight categories are inclusive of semiconductor (sub1), computer and peripheral devices (sub2), photodiode (sub3), telecommunication internet (sub4), electronic components (sub5), electronic channel (sub6), information service (sub7), and other electronic industry (sub8). The definitions of variables are given in Table 1.
*p < 0.10; **p < 0.05; ***p < 0.01; the t-statistics are omitted in Table 6.
Panel C: The coefficient of variables in the Decline stage 1.201 1.580 CFOSTAGE CFISTAGE 1.599 0.632 2.250 0.733 CFFSTAGE R&DSTAGE 10.873 5.870 2.984 5.022 OISTAGE ANOISTAGE 5.731 4.319 Adjusted R2 0.636 0.607 Number of observables 360 124
six financial performance measures in the different life cycle stages. These findings suggest that the specific characteristics of the sub-industries warrant further investigation.
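For readers who wish to reproduce this kind of sub-industry split, the sketch below re-estimates the stage-comparison regression within each sub-industry group; the sub_industry and stage columns, the file name, and the small-sample skip rule are illustrative assumptions rather than details from the chapter.

```python
# Hedged sketch: re-estimating the interaction model within each sub-industry.
# Assumes the firm-year frame already carries the life cycle stage labels
# produced by the earlier classification sketch.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("tej_firm_years.csv")            # assumed panel with "stage" and "sub_industry"
measures = ["CFO", "CFI", "CFF", "RD", "OI", "ANOI"]

def stage_design(frame, stage_one, stage_zero):
    sub = frame[frame["stage"].isin([stage_one, stage_zero])].copy()
    sub["STAGE"] = (sub["stage"] == stage_one).astype(int)
    X = sub[["BVC"] + measures].copy()
    for m in measures:                            # two-way interaction terms
        X[f"{m}xSTAGE"] = sub[m] * sub["STAGE"]
    return sub["MV"], sm.add_constant(X)

# Growth (STAGE = 1) versus mature (STAGE = 0) within each sub-industry.
for code, frame in df.groupby("sub_industry"):
    y, X = stage_design(frame, "growth", "mature")
    if len(y) <= X.shape[1]:                      # skip groups too small to estimate
        continue
    fit = sm.OLS(y, X).fit()
    print(code, f"adj. R2 = {fit.rsquared_adj:.3f}", f"N = {int(fit.nobs)}")
```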
CONCLUSION

This study examines financial performance measures in different life cycle stages to identify the key measures for investors when they value the firm, using data from firms in Taiwan's information electronics industry. We find that R&D and ANOI are more value-relevant in the growth stage. In addition, OI is more value-relevant in the mature stage, but CFO has almost the same value relevance in all stages. CFF and CFI convey more information in the decline stage than in the growth stage, but not more than in the mature stage. Our empirical evidence shows a differential role of financial performance measures across life cycle stages and highlights the role of cash flows and various earnings-related items in explaining firm market value. Our findings differ from Liang and Yao's (2005) argument that traditional financial statements face an unprecedented challenge in adequately evaluating the value of high-tech firms. Our findings contribute to research on the value relevance of financial performance measures, which can be of interest to a broad public, including academic researchers, standard setters, financial statement preparers and users, and practitioners. The results of this study are specific to the information electronics industry in Taiwan. As the business strategies of the Taiwanese information electronics industry are multifaceted and different firms have diverse specializations, we see benefits in future research that classifies and identifies firms based on product lines and distribution channel positions in the supply chain.
NOTES

1. Because of data limitations, we do not develop a hypothesis for the start-up stage.
2. The simplicity of the firm life cycle concept makes it vulnerable to criticism (Day, 1981). In practice, the stages of the firm life cycle may not occur in chronological order. For example, the Japanese company Kongo Gumi is one of the oldest firms and defies the typical firm life cycle ordering. Further, other research
suggests that firms may ensure survival during the decline stage (e.g., Clarkson, 1995; Kazanjian, 1988; Jawahar & Mclaughlin, 2001; Zoltners et al., 2006).
3. After reclassifying, there are 978 firm-year observations in the growth stage, 2922 firm-year observations in the mature stage, and 962 firm-year observations in the decline stage.
4. The eight categories are inclusive of semiconductor (sub1), computer and peripheral devices (sub2), photodiode (sub3), telecommunication internet (sub4), electronic components (sub5), electronic channel (sub6), information service (sub7), and other electronic industry (sub8).
REFERENCES

Anderson, C. R., & Zeithaml, C. P. (1984). Stage of the product life cycle, business strategy, and business performance. Academy of Management Journal, 27(March), 5–24.
Anthony, J. H., & Ramesh, K. (1992). Association between accounting performance measures and stock prices. Journal of Accounting and Economics, 15(August), 203–227.
Baginski, S. P., Lorek, K. S., & Branson, B. C. (1999). The relationship between economic characteristics and alternative annual earnings persistence measures. The Accounting Review, 74(January), 105–120.
Black, E. L. (1998). Life-cycle impacts on the incremental value-relevance of earnings and cash flow measures. Journal of Financial Statement Analysis, 4(Fall), 40–56.
Booth, R. (1998). The measurement of intellectual capital. Management Accounting, 76(November), 26–28.
Clarkson, M. E. (1995). A stakeholder framework for analyzing and evaluating corporate social performance. Academy of Management Review, 20, 92–117.
Day, G. S. (1981). The product life cycle: Analysis and applications issues. Journal of Marketing, 45(Autumn), 60–67.
Dodge, H. J., Fullerton, S., & Robbins, J. E. (1994). Stage of the organizational life cycle and competition as mediators of problem perception for small businesses. Strategic Management Journal, 15(February), 121–134.
Fairfield, P. M., Sweeney, R. J., & Yohn, T. L. (1996). Accounting classification and the predictive content of earnings. The Accounting Review, 71(July), 337–355.
Hand, J. R. M. (2003). The relevance of financial statements within and across private and public equity markets. Working Paper, Kenan-Flagler Business School, UNC Chapel Hill.
Hand, J. R. M. (2005). The value relevance of financial statements in the venture capital market. The Accounting Review, 80(April), 613–648.
Hofer, W. (1975). Toward a contingency theory of business strategy. Academy of Management Journal, 18(December), 784–810.
Ittner, C. D., Larcker, D. F., & Randall, T. (2003). Performance implications of strategic performance measurement in financial service firms. Accounting, Organizations and Society, 28, 715–741.
Jawahar, I. M., & Mclaughlin, G. L. (2001). Toward a descriptive stakeholder theory: An organizational life cycle approach. Academy of Management Review, 26(July), 397–414.
Joos, P. (2002). Explaining cross-sectional differences in market-to-book ratios in the pharmaceutical industry. Working Paper, University of Rochester.
Jorion, P., & Talmor, E. (2001). Value relevance of financial and nonfinancial information in emerging industries: The changing role of web traffic data. Working Paper, University of California at Irvine.
Kazanjian, R. K. (1988). The relation of dominant problems to stage of growth in technology-based new ventures. Academy of Management Journal, 31(June), 257–279.
Klein, A., & Marquardt, C. (2006). Fundamentals of accounting losses. The Accounting Review, 81(January), 179–206.
Koberg, C. S., Uhlenbruck, N., & Sarason, Y. (1996). Facilitators of organizational innovation: The role of life-cycle stage. Journal of Business Venturing, 11(March), 133–149.
Konrath, L. (1990). Audit risk assessment: A discussion and illustration of the interrelated nature of statements on auditing standards. The Women CPA (Summer), 14–16.
Liang, C. J., & Yao, M. L. (2005). The value-relevance of financial and nonfinancial information: Evidence from Taiwan's information electronics industry. Review of Quantitative Finance and Accounting, 24(March), 135–157.
Myers, S. C. (1977). Determinants of corporate borrowing. Journal of Financial Economics, 5, 147–175.
Ohlson, J. (1995). Earnings, book values and dividends in security valuation. Contemporary Accounting Research, 11(Spring), 661–687.
Pashley, M. M., & Philippatos, G. C. (1990). Voluntary divestitures and corporate life cycle: Some empirical evidence. Applied Economics, 22(September), 1181–1196.
Penman, S. H. (2001). Financial statement analysis and security valuation. Irwin: McGraw-Hill.
Porter, M. E. (1980). Competitive strategy: Techniques for analyzing industries and competitors. New York, NY: Free Press.
Robinson, K. C. (1998). An examination of the influence of industry structure on the eight alternative measures of new venture performance for high potential independent new ventures. Journal of Business Venturing, 14(March), 165–187.
Robinson, K. C., & McDougall, P. P. (2001). Entry barriers and new venture performance: A comparison of universal and contingency approaches. Strategic Management Journal, 22(June–July), 659–685.
Zoltners, A. A., Sinha, P., & Lorimer, S. E. (2006). Match your sales force structure to your business life cycle. Harvard Business Review, 84(July–August), 81–89.
FORECASTING MODEL FOR STRATEGIC AND OPERATIONS PLANNING OF A NONPROFIT HEALTH CARE ORGANIZATION

Kalyan S. Pasupathy

ABSTRACT

This article describes a real-life implementation of a financial forecasting model to inform budgeting and strategic planning. The organization is a charity-based health system whose hospitals and medical centers provide care to the community. The health system performs a central budgeting process that is typically based on the aggregation of individual budgets from the various hospitals and medical centers within the system. All financial data are reported to a central financial information system. Traditionally, budgeting was based on the prior year's financial performance with a slight adjustment based on the hospital or medical center finance department's educated guess. This article describes the new forecasting method instituted to predict revenue and expenses and to improve the budget planning process. Finally, the forecasts from the model are compared with real data to demonstrate the accuracy of the financial forecasts. The model has since been used in the budgeting process.
Advances in Business and Management Forecasting, Volume 7, 59–69
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1477-4070/doi:10.1108/S1477-4070(2010)0000007007
INTRODUCTION

The financial industry has been receiving a lot of attention and scrutiny since the market crash in the fall of 2008, and the resulting crisis is affecting several industries. Nonprofit healthcare organizations are no exception. These organizations have been facing trickle-down effects and are under pressure to sustain revenue in order to maintain a balanced budget. Based on the 2008 American Hospital Association statistics, there are 2919 private nonprofit community hospitals in the United States. These hospitals represent 59% of the total number of community hospitals and 77% of all privately owned community hospitals. Hence, it is absolutely necessary for these hospitals and health systems to plan their strategy appropriately and allocate budgets in the right amounts.

A charity healthcare system in the United States has a combination of hospitals and medical centers that provide care in the community. The health system performs centralized budget planning, which is typically based on the aggregation of individual budgets from the various hospitals and medical centers that are part of the system. These individual budgets are traditionally based on the prior year's financial performance with a slight adjustment based on the finance department's educated guess. However, there are several discrepancies in this planning process because it is not based on historical data or any analytical approach.

Budgets are important tools that enable the management team to lead an organization toward its goals. They help organizations institutionalize goals, monitor the performance and progress of both individual products and the business, and measure the performance of different departments and managers. To be effective planning tools, budget numbers need to be realistic, honest, accurate, and challenging. Otherwise, both the purpose of strategic planning and manager motivation are undermined. However, there are several challenges that organizations need to overcome to arrive at a realistic budgetary estimate. There is always uncertainty in the business environment, and this makes precise predictions very difficult (Bart, 1988). Healthcare financial managers can leverage advanced forecasting techniques to better plan, manage, and adjust the decisions that will help them achieve their performance goals.

Forecasts play an important role in helping an organization make strategic decisions, as they can influence the resources allocated to any anticipated changes in its environment. Organizations that overinvest because of a positive forecast that did not materialize suffer from higher fixed costs and overheads. Firms that
underestimate performance cannot keep up with increased demand and technological advancement, and thus lose their competitive advantage. Therefore, forecasting ability could be a distinctive organizational capability (Durand, 2003). The charity-based health system is interested in developing a forecasting model for its revenue and expense categories. All financial data are reported to a central system. This article demonstrates a method to forecast the revenue and expense variables for effective budget planning purposes.
METHOD

Forecasting is a well-established method that has been used to estimate and predict the future in order to plan accordingly. Financial forecasting is performed by organizations to predict how future finances will play out, so that budgets can be allocated effectively. Stark, Mould, and Schweikert (2008) provide a series of steps to conduct financial forecasting:

Step 1: Establish the business need
Step 2: Acquire data
Step 3: Build the model
Step 4: Evaluate the results
Step 5: Apply the forecast.
To establish the business need, these key questions should be answered:
What decisions will the forecast influence?
Who are the key stakeholders?
What metrics are needed, and at what level of detail?
How far forward should the forecast project in terms of years, months, weeks, or days?
How will accuracy be measured, and what is the acceptable level?
What is the impact of under- and over-forecasting?

For each business driver and influencing factor, the typical forecasting effort should use at least two years, and ideally six years, of historical data. When forecasting efforts have short time horizons in small time periods, fewer data can be used. Multiple data sources, such as healthcare information systems (HIS), spreadsheets, small departmental databases, and/or an enterprise data warehouse, should be used to collect the most accurate, balanced, and robust data sets. A common challenge a hospital may face in
forecasting is the practice of purging aging data from an HIS after one or two years, which makes accurate forecasting very difficult. Hospital financial managers need to ensure that historical data are available for budgeting and planning. Another challenge is that data might exist in multiple separate database systems (Medina-Borja & Pasupathy, 2007), and expert systems engineers and informaticians are necessary to mine data from these various sources (Pasupathy, 2010).

The third step of forecasting is to make a decision regarding the type of forecasting model to use. The forecasting model is the technique or algorithm that determines the projections based on identified business drivers, influencing factors, and business constraints. There are three major categories of forecasting models: cause-and-effect, time series, and judgment. Forecasting models are often combined to produce the most accurate results for a given business need. Business and technical experts need to be consulted for advice when selecting the best model for a given situation.

Once the model has been built and executed, the resulting forecast accuracy should be evaluated using statistical measures such as the F-statistic, p-values, the standard error of the estimate, or R2 values. Regression models can be used to describe what percentage of the changes from month to month can be explained by the forecast. By visualizing the results, a healthcare manager can easily understand the model's accuracy. Model accuracy should be tracked and monitored by calculating the difference from month to month, and a forecast accuracy of more than 85% is considered to be very good. To compare forecast accuracy over time, the mean absolute percentage error (MAPE) may be used. If the MAPE increases over time, then not all influencing factors have been included in the model. This metric allows a healthcare financial manager to evaluate whether the forecast model needs to be tuned.

The final step is to apply the forecast. Once all the work has been done to create a high-quality forecast, it should be deployed to the stakeholders and end users in a manner tailored to their use. The forecast should ideally be made accessible to all appropriate business areas in reports and analyses packaged to unique end-user perspectives.
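As a concrete illustration of the accuracy checks described above, the following sketch computes MAPE from paired actual and forecast values and applies the 85% accuracy rule of thumb. The function names are hypothetical and the code is not part of the implementation described in this article.

```python
# Illustrative helpers for evaluating forecast accuracy with MAPE.
from typing import Sequence

def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed as a percentage."""
    pairs = list(zip(actual, forecast))
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)

def is_acceptable(actual, forecast, min_accuracy_pct: float = 85.0) -> bool:
    """Treat (100 - MAPE) as forecast accuracy and compare to a threshold."""
    return (100.0 - mape(actual, forecast)) >= min_accuracy_pct

# Example: is_acceptable([100, 110, 120], [108, 112, 118]) -> True
```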
APPLICATION AND RESULTS

The remainder of the article describes these steps and explains how they were applied at the healthcare organization.
Step 1: Establish the Business Need

This project was undertaken to study the trends and patterns in the financial behavior of the healthcare system, with the objective of forecasting its financial state into the future. Based on the success of validation and the effectiveness of the forecasting model, it would then be used for budgeting as part of strategic planning. The key stakeholders involved were directors from Corporate Strategy, the Finance Department, and Enterprise Risk Management; operations researchers from the Health Management Engineering Department at headquarters; and representatives from the finance departments in the hospitals and medical centers. Based on discussions, the decision was made to look at four line items, or variables: two on the revenue side and two on the expense side. They are listed in Table 1. The forecasting was conducted on a quarterly basis (3-month increments) over a time horizon of seven years, from fiscal year 2003 quarter 1 (FY03Q1) through fiscal year 2009 quarter 3 (FY09Q3). The forecast would be to predict the last quarter of 2009 (FY09Q4). The accuracy of the forecast would be measured by comparing the forecast with the actual amounts for FY09Q4.

The analysis would be conducted on the mean (average) for each variable across all hospitals and medical centers over the quarter. The mean was used as the measure, as opposed to the total, for various reasons. First, some of the hospitals and medical centers went through mergers or annexations of other hospitals, medical centers, and/or clinics, and others were newly established. Hence, the number of hospitals and medical centers over the years was not the same, and using a total would skew the forecasting analysis. Second, hospitals and medical centers report on different types of financial forms. Hence, the various accounts used to produce the aggregate revenue and expense variables are different between the hospitals and the medical centers. To avoid any adverse effect of these factors on the forecasting, mean values were used.

Table 1. List of Financial Variables for this Analysis.

Variable               Type       Description
Service revenue        Revenue    Income from all lines of service
Total revenue          Revenue    Total income from all sources and funds
Salary and benefits    Expense    Salary and benefits to employees in all lines of service
Total expenses         Expense    Total expenditures in all services
Step 2: Acquire Data

Data for the forecasting was obtained from the historical information warehouse – the Health system (Hospital & Medical Center) Financial Information System (HFIS) in Oracle. The advantage of such a system is that none of the data was purged. The historical data accounted for four quarters during the first six years and three quarters for the last year. This gave a total of 27 quarters' worth of data. All the line items and account values were obtained using Structured Query Language (SQL) and aggregated to calculate the four variables for each hospital and medical center using SQL, Microsoft Access, and Microsoft Excel. The data was statistically tested for the existence of any outliers using SPSS. Next, the mean for each quarter was calculated.

Data preparation

The data presented a few issues that needed to be addressed before conducting the forecasting. Some of the hospitals and medical centers, around the time of mergers and/or annexations, failed to report certain data elements; hence, complete financial data was not available from all hospitals and medical centers. A few hospitals and medical centers therefore had to be excluded from the analysis because of a lack of consistency in their reporting (Table 2).

Table 2. Number of Hospitals and Medical Centers Included in the Analysis.

Year    Number of Hospitals and Medical Centers Included    Total Number of Hospitals and Medical Centers
FY03    25                                                   26
FY04    26                                                   26
FY05    24                                                   25
FY06    25                                                   25
FY07    24                                                   25
FY08    23                                                   26
FY09    26(a)                                                30

(a) Out of the total 30 hospitals and medical centers, 29 had good data. Only 26 were included to be consistent with the prior years. When analyzed with 30 hospitals and medical centers, even though means were used, there was a considerable drop in the dollar amounts for FY09, which was suspected to be due to the abrupt increase in the number of hospitals and medical centers from 26 to 30 for that fiscal year.
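The aggregation described in Step 2 can be expressed compactly in code. The sketch below is an illustration under assumed inputs (a long-format extract with hypothetical column names facility, fiscal_quarter, variable, and amount); it is not the SQL/Access/Excel workflow actually used.

```python
# Sketch of Step 2 data preparation (assumed long-format extract from the
# financial information system; column names are hypothetical).
import pandas as pd

def quarterly_means(extract: pd.DataFrame, expected_quarters: int) -> pd.DataFrame:
    # Drop facilities that did not report every quarter, mirroring the
    # exclusions summarized in Table 2.
    counts = extract.groupby("facility")["fiscal_quarter"].nunique()
    complete = counts[counts == expected_quarters].index
    consistent = extract[extract["facility"].isin(complete)]

    # Mean (not total) per variable and quarter, so that mergers and newly
    # added facilities do not skew the series.
    return (consistent
            .groupby(["variable", "fiscal_quarter"])["amount"]
            .mean()
            .unstack("fiscal_quarter"))
```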
Step 3: Build the Model

The data for the four variables was first plotted over time. Fig. 1 shows the trend for the mean of the four variables over time from FY03Q1 through FY09Q3. All four variables exhibit an increasing trend. However, the increasing trend is pronounced in the case of service revenue, salary and benefit expenses, and total expenses. Total revenue has more of a constant trend with very little increase over time. This means that the revenue from other sources like donations, grants, and contracts has been decreasing over time. A clear seasonal pattern can be observed for the variables. For the most part, total revenue, salary and benefit expenses, and total expenses each show a dip in the first quarter, a plateau in the second and third quarters, and a peak in the fourth quarter. Service revenue, on the other hand, shows a dip in the first quarter, a peak in the second, and a plateau in the third and fourth quarters (Fig. 1).

Since the presence of seasonality is evident, analysis was done to identify the seasonality indices for each quarter for each variable. The seasonality indices were calculated using the average deviation of each quarter from the overall trend. The seasonality indices for the quarters for each variable are shown in Table 3. These can be used for future forecasting. Table 4 lists the forecasting equations for the financial variables with the corresponding p-values for analysis of variance (ANOVA) and the confidence levels. Once the forecasting models have been established and the seasonality indices are available, the forecast for the fourth quarter of fiscal year 2009 (FY09Q4) can be computed using x = 28. The forecasted values for the four variables are shown in Table 5. The FY09Q4 forecast was used to compute the FY09 end-of-year total values by adding this to the total of the other three quarters.
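One way to reproduce the kind of trend-plus-seasonality calculation described in Step 3 is sketched below. It fits a linear trend and computes a seasonal index per quarter as the average ratio of actual to trend, which yields indices near 1.0 like those in Table 3; the chapter describes its indices as average deviations from the overall trend, so this is an approximation of the general approach, not the exact procedure.

```python
# Sketch of Step 3: linear trend plus quarterly seasonality indices
# (assumes `y` is a sequence of 27 quarterly means, FY03Q1 ... FY09Q3).
import numpy as np

def trend_and_indices(y):
    y = np.asarray(y, dtype=float)
    x = np.arange(1, len(y) + 1)          # x = 1 for FY03Q1, 2 for FY03Q2, ...
    slope, intercept = np.polyfit(x, y, 1)
    trend = intercept + slope * x
    # Seasonal index per quarter: average ratio of actual to trend, so that
    # values near 1.0 mean "typical" quarters (compare Table 3).
    indices = {}
    for q in (1, 2, 3, 4):
        mask = (x - 1) % 4 == q - 1
        indices[q] = float(np.mean(y[mask] / trend[mask]))
    return slope, intercept, indices

# Forecast for FY09Q4 (x = 28), optionally adjusted by the Q4 index:
# slope, intercept, idx = trend_and_indices(y)
# forecast_q4 = (intercept + slope * 28) * idx[4]
```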
Fig. 1. Quarterly Trend for Financial Variables (panels: Mean Service Revenue and Mean Total Revenue, FY03Q1–FY09Q3).

Step 4: Evaluate the Results

The FY09 forecasts were compared with actual FY09 data when available. The bar chart in Fig. 2 shows the error (i.e., the difference between the forecasted amounts and the actual amounts) of the forecasting model. The forecasts for all the financial variables overestimated the actual data by less than 9%. The gap between the forecasts and the actual data is shown as error on the chart. The prediction of a deficit, and the amount of the deficit, was right on target, as can be seen.
Table 3. Quarterly Seasonality Indices for Financial Variables.

Variable                        Quarter 1   Quarter 2   Quarter 3   Quarter 4
Service revenue                 0.603       1.416       0.965       1.015
Total revenue                   0.697       0.990       0.978       1.389
Salary and benefit expenses     0.890       0.937       0.970       1.235
Total expenses                  0.933       0.973       0.985       1.126

Table 4. Forecasting Equation for Financial Variables.

Variable                        Equation              p-Value for Regression   Confidence
Service revenue                 4056.7x + 181938      0.0103                   High
Total revenue                   7101.7x + 981139      0.0016                   Very high
Salary and benefit expenses     12096x + 561746       2.2E-10                  Very high
Total expenses                  15362x + 825067       1.07E-10                 Very high

Note: x is the time period for the forecasting. For FY03Q1, x = 1; for FY03Q2, x = 2, etc.

Table 5. Forecast for FY09Q4 Financial Variables.

Variable                        Forecast Value
Service revenue                 $295,525.60
Total revenue                   $1,179,986.60
Salary and benefit expenses     $900,434.00
Total expenses                  $1,255,203.00
Fig. 2. Comparison of FY09 Forecast and Actual Financials (actual amounts and forecast error for service revenue, total revenue, salary and benefit expenses, total expenses, and the deficit).
Step 5: Apply the Forecast

The forecasting model and the results were presented to all the stakeholders and instituted within the organization. The overall accuracy of at least 91% was regarded as very good. A decision support tool was developed based on the forecasting model for use by the decision-makers. Several training/discussion sessions were conducted to help the decision-makers understand and take control of the evidence-based budgeting process. Further, the forecasting model will be revisited when more data are available to refine and fine-tune the model. During this process, it might be advisable to drop some of the oldest quarters. This will ensure that the model continues to capture the latest trend and pattern variations in real life.
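The suggestion to drop some of the oldest quarters can be implemented as a rolling estimation window. The sketch below is illustrative only; the window length is an assumption, not a recommendation from the article.

```python
# Illustrative rolling-window refit: keep only the most recent `window`
# quarters so the trend reflects the latest patterns (window length is an
# assumption for illustration).
import numpy as np

def rolling_trend_forecast(quarterly_means, window: int = 20) -> float:
    y = np.asarray(quarterly_means, dtype=float)[-window:]
    x = np.arange(1, len(y) + 1)
    slope, intercept = np.polyfit(x, y, 1)
    return intercept + slope * (len(y) + 1)   # one-quarter-ahead forecast
```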
CONCLUSION

This real-life implementation project demonstrated how quarterly financial data can be used to develop forecasting models and predict the future for budget planning purposes. The forecasts were then compared against the actual amounts to evaluate the effectiveness of the forecasting models. Specifically, forecasting models for the four financial variables were developed, but with slightly varying degrees of confidence. Total revenue, salary and benefit expenses, and total expenses can be forecasted with high confidence, and service revenue can be forecasted with acceptable confidence.

For some of the years, a few of the hospitals and medical centers could not be included in the model development. Mergers and acquisitions are part of reality, and the modeler needs to ensure that they do not bias the model. There were some data issues, and the accuracy of the model and the confidence in the forecasting can be considerably increased by improving the quality of the financial data. With regard to quarterly data collection, simple business rules and quality checks can be included. Some of the rules can be to ensure the reporting of only cumulative amounts over the quarters and to make sure that any number reported in the second quarter is at least equal to, if not greater than, that for the first quarter. Rules are also needed to detect data entry errors in reporting, such as an extra zero or a missing decimal point. These can be avoided by having checks that compare the variable fields with historical data as well as the approximate budget size. Other checks need to be done to track zeroes and negative values and document the reasons for these.
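The reporting rules and quality checks suggested above translate naturally into simple automated tests. The sketch below is an illustration with hypothetical column names; it is not part of the system described in this article.

```python
# Illustrative quality checks for quarterly financial reporting (assumed
# cumulative amounts per facility in columns q1..q4; names are hypothetical).
import pandas as pd

def flag_reporting_issues(report: pd.DataFrame) -> pd.DataFrame:
    flags = pd.DataFrame(index=report.index)
    # Cumulative amounts should never decrease from one quarter to the next.
    flags["non_cumulative"] = ((report["q2"] < report["q1"]) |
                               (report["q3"] < report["q2"]) |
                               (report["q4"] < report["q3"]))
    # Zero or negative values need documented explanations.
    flags["zero_or_negative"] = (report[["q1", "q2", "q3", "q4"]] <= 0).any(axis=1)
    # A value far out of line with the prior year suggests an extra zero
    # (or a missing decimal point); the thresholds are illustrative.
    ratio = report["q4"] / report["prior_year_q4"]
    flags["magnitude_suspect"] = (ratio > 5) | (ratio < 0.2)
    return flags
```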
As a focus for future work, tools can be developed to test how a model will handle future conditions as part of scenario analysis. By running different scenarios, especially those with known outcomes, healthcare managers can gain a level of comfort with the model's behavior and accuracy. For example, a forecast of revenues for a certain inpatient volume could test a base, a conservative, and an aggressive scenario for population growth rates, additions of new managed care populations and employer groups, and clinical initiatives to convert inpatients to outpatients. Under each scenario, the forecast should be tested and validated for its ability to provide reasonable results.
REFERENCES

Bart, C. K. (1988). Budgeting gamesmanship. The Academy of Management Executive, 2(4), 285–294.
Durand, R. (2003). Predicting a firm's forecasting ability: The roles of organizational illusion of control and organizational attention. Strategic Management Journal, 24(9), 821–838.
Medina-Borja, A., & Pasupathy, K. (2007). Uncovering complex relationships in system dynamics modeling: Exploring the use of CHAID and CART. System Dynamics Society international conference, August. Boston, MA. (Vol. 25, pp. 2883–2906).
Pasupathy, K. (2010). Systems engineering and health informatics – Context, content and implementation. In: S. Kabene (Ed.), Healthcare and the effect of technology: Developments, challenges and advancements (pp. 123–144). Hershey, PA: IGI Global.
Stark, D., Mould, D., & Schweikert, A. (2008). 5 steps to creating a forecast. Healthcare Financial Management, 62(4), 100.
PART II MARKET FORECASTING
SEASONAL REGRESSION FORECASTING IN THE U.S. BEER IMPORT MARKET

John F. Kros and Christopher M. Keller

ABSTRACT

This chapter presents an Excel-based regression analysis to forecast seasonal demand using U.S. Import Beer sales data. The following seasonal regression models are presented and interpreted: a simple yearly model, a quarterly model, a semi-annual model, and a monthly model. The results of the models are compared, and a discussion of each model's efficacy is provided. The yearly model does the best job of forecasting U.S. Import Beer sales. However, the yearly model does not provide a window into shorter-term (i.e., monthly) forecasting periods and the subsequent peaks and valleys in demand. Although the monthly seasonal regression model does not explain as much variance in the data as the yearly model, it fits the actual data very well. The monthly model is considered a good forecasting model based on the significance of the regression statistics and its low mean absolute percentage error. Therefore, it can be concluded that the monthly seasonal model presented does an overall good job of forecasting U.S. Import Beer sales and assists managers in shorter-time-frame forecasting.
Advances in Business and Management Forecasting, Volume 7, 73–96
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1477-4070/doi:10.1108/S1477-4070(2010)0000007008
INTRODUCTION

Forecasting is an important part of all successful marketing programs, and numerous articles speak to the growing need for accurate forecasts (Keating & Wilson, 1988; Hanke & Weigand, 1994; Wilson & Daubek, 1989; Loomis & Cox, 2003; Armstrong & Brodie, 1999). Related studies have shown that computer usage in forecasting courses has grown (Wilson & Daubek, 1989; Armstrong & Brodie, 1999; Loomis & Cox, 2003). For more than twenty years, forecasters have used computerized spreadsheets in an attempt to produce more accurate forecasts. During this time, numerous researchers have published work ranging from selecting the appropriate method of forecasting (see Chase, 1997) to how to use specific forecasting approaches in spreadsheets (Albright, Winston, & Zappe, 2005; Kros, 2009; Ragsdale, 2006; Savage, 2003; Grossman, 1999). Other studies have shown that the most commonly used method of forecasting is based on linear multiple regression (Wilson & Daubek, 1989; Hanke, 1984, 1988; Loomis & Cox, 2000). Radovilsky and Eyck (2000) provided a much-needed discussion of forecasting with Excel. Although detailed, most of these works failed to address the issue of seasonality and the creation of in-depth seasonal forecasting models in Excel, and even fewer provide marketing instructors with the means to teach seasonally influenced forecasting effectively.

Over the past decade, there has also been a trend toward the use of spreadsheets to solve business problems (Grossman, 1997; Powell, 1997; Fylstra, Lasdon, Watson, & Waren, 1998; Conway & Ragsdale, 1997; Savage, 2003). Many textbooks for the traditional quantitative business analysis course now include spreadsheet modeling applications (Anderson, Sweeney, & Williams, 2005; Ragsdale, 2006). Some textbook authors utilize spreadsheet modeling as a major focus of business problem-solving (Kros, 2009; Moore & Weatherford, 2001). Consequently, the use of spreadsheets in financial and process analysis has become commonplace for many managers today, and the problem-solving process now includes thinking in terms of spreadsheets rather than mathematical functions (Powell, 1997; Sonntag & Grossman, 1999). Since many managers use spreadsheets extensively, the authors feel it is extremely important to examine the development of spreadsheet forecasting models. There are several benefits to spreadsheet models for forecasting: (a) spreadsheet use is ubiquitous in and around corporate settings, (b) data input is simple and the data output functionality needed is quick and easily interpretable, (c) some form of spreadsheet software is available without large expense for the purchase of proprietary software,
and (d) most firms' employees tend to be familiar with using spreadsheets for problem-solving, so model results tend to be credible to all stakeholders, from mid-level managers up to corporate executive officers.
PURPOSE

In this chapter, we provide readers with a forecasting example utilizing Excel and regression analysis under conditions characterized by seasonality. To accomplish the stated objective, the authors present a discussion of seasonal regression models in order to familiarize readers with this important concept. This is followed by an overview of using Excel-based regression analysis to forecast seasonal demand using Imported Beer sales data from the Beer Institute (Beer Institute, 2009). The following seasonal regression models are presented and interpreted: a simple yearly model, a quarterly model, a semi-annual model, and a monthly model. The results of the models are compared and a discussion of each model's efficacy is provided.
LITERATURE REVIEW

Corporations typically need forecasts that cover different time spans in order to achieve operational, tactical, and strategic intents. Firms typically use monthly data from the last one or two years for operational or short-term forecasting. Tactical forecasting is generally based on quarterly data from the last five to six years or comparable annual data. Strategic forecasting generally requires additional periods in order to make projections for 25 or more years into the future (Lapide, 2002; Zhou, 1999). Tactical and strategic forecasting is made more complex when the issue of seasonality is added to the analysis.

One goal of a forecasting model is to account for as much of the systematic variation in the behavior of a time series data set as possible. Moving average, exponential smoothing, and linear regression models all attempt to account for systematic variation. However, each of these models may fail due to additional systematic variation that is not accounted for. In many business time series data sets, a major source of systematic variation comes from seasonal effects. Seasonal variation is characterized by increases or decreases in the time series data at regular time intervals, namely calendar
or climatic changes (Ragsdale, 2006). For example, sales demand for beer in the United States has increased over time but tends to vary during the year, being higher in the Spring and Summer months than in the Fall and Winter months (http://www.foodandbeveragereports.com). Therefore, time is not the only variable that has an impact on beer sales; multiple factors play a role. Multiple linear regression models are a commonly used technique in forecasting when multiple independent variables impact a dependent variable. From the earlier example, beer sales could be considered the dependent variable while time and a seasonal factor could be considered independent variables, and this relationship is represented in the following general model for multiple linear regression:

Yt = b0 + b1X1t + b2X2t + b3X3t + ... + bnXnt + et    (1)
where X1t is the time and X2t through Xnt are seasonal indicators. The X's denote the independent variables while the Y denotes the dependent variable. For example, the X1t term represents the first independent variable for the time period t (e.g., X11 = 1, X12 = 2, X13 = 3, etc.). The et term denotes the random variation in the time series not accounted for by the model. Since the values of Yt are assumed to vary randomly around the regression function, the average or expected value of et is 0. Therefore, if an ordinary least squares estimator is employed the best estimate of Yt for any time period t is:

Yhat_t = b0 + b1X1t + b2X2t + b3X3t + ... + bnXnt    (2)
Eq. (2) represents the line passing through the time series that minimizes the sum of squared differences between the actual values (Yt) and the estimated values (Yhat_t). In the case when n = 1, the equation represents simple regression. However, if a data set contains seasonal variation, a standard multiple linear regression model generally does not provide very good results. With seasonal effects, the data tend to deviate from the trend lines in noticeable patterns. Forecasts for future time periods would be much more accurate if the regression model reflected these drops and ascents in the data. Seasonal effects can be modeled using regression by including indicator variables, which are created to indicate the time period to which each observation belongs. The next sections present the seasonal regression forecasting models: first a simple yearly model, then a quarterly model, a semi-annual model, and finally a monthly model.
Many publicly held companies are required to submit quarterly (data occurring in cycles of three months – a common business cycle) reports regarding the status of their businesses. It is also very common for quarterly forecasts to be used to develop future quarterly projections. Therefore, if quarterly data were being analyzed, the indicator variables for a quarterly regression model could be stated as follows:

X2t = 1 if Yt is an observation from the first quarterly period of any year; 0 otherwise    (3)
X3t = 1 if Yt is an observation from the second quarterly period of any year; 0 otherwise    (4)
X4t = 1 if Yt is an observation from the third quarterly period of any year; 0 otherwise    (5)
X5t = 1 if Yt is an observation from the fourth quarterly period of any year; 0 otherwise    (6)
Notice the unique coding used to define X2t, X3t, X4t, and X5t in Eqs. (3) through (6). Table 1 summarizes the coding structure. Take note that the X1t variable is time. Overall, this coding structure can apply to any time frame the forecaster chooses (e.g., monthly seasonal periods, or peak versus non-peak seasons) and each of the models that will be discussed herein. For brevity purposes, only the quarterly model is illustrated here. The coding structures for the models are contained in Tables 1, 2, and 3.

Table 1. Quarterly Seasonal Coding Structure.

Quarterly Period        Value of Independent Variable
                        X2t   X3t   X4t   X5t
1 – Jan/Feb/Mar         1     0     0     0
2 – Apr/May/Jun         0     1     0     0
3 – Jul/Aug/Sept        0     0     1     0
4 – Oct/Nov/Dec         0     0     0     1
Table 2. Semi-Annual Seasonal Coding Structure.

Semi-Annual Period      Value of Independent Variable X2t
1 – Jan–May             0
2 – Jun–Dec             1

Table 3. Monthly Seasonal Coding Structure.

Monthly Period    X2t  X3t  X4t  X5t  X6t  X7t  X8t  X9t  X10t  X11t  X12t  X13t
1 – Jan           1    0    0    0    0    0    0    0    0     0     0     0
2 – Feb           0    1    0    0    0    0    0    0    0     0     0     0
3 – Mar           0    0    1    0    0    0    0    0    0     0     0     0
4 – Apr           0    0    0    1    0    0    0    0    0     0     0     0
5 – May           0    0    0    0    1    0    0    0    0     0     0     0
6 – Jun           0    0    0    0    0    1    0    0    0     0     0     0
7 – Jul           0    0    0    0    0    0    1    0    0     0     0     0
8 – Aug           0    0    0    0    0    0    0    1    0     0     0     0
9 – Sept          0    0    0    0    0    0    0    0    1     0     0     0
10 – Oct          0    0    0    0    0    0    0    0    0     1     0     0
11 – Nov          0    0    0    0    0    0    0    0    0     0     1     0
12 – Dec          0    0    0    0    0    0    0    0    0     0     0     1
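For readers working outside Excel, the indicator coding of Tables 1–3 can be generated automatically. The sketch below is a minimal illustration under assumed inputs (a pandas Series of monthly sales with a DatetimeIndex); it is not part of the chapter's Excel workflow.

```python
# Sketch: build the time index and the seasonal indicator columns of
# Tables 1-3 for a monthly sales series (assumes a pandas Series `sales`
# with a monthly DatetimeIndex; column names are illustrative).
import pandas as pd

def seasonal_design(sales: pd.Series) -> pd.DataFrame:
    df = pd.DataFrame({"barrels": sales.values}, index=sales.index)
    df["t"] = range(1, len(df) + 1)                      # X1t: time
    df["quarter"] = df.index.quarter
    df["month"] = df.index.month
    # 0/1 indicators; drop_first omits one base-level season, as discussed
    # in the next section.
    quarters = pd.get_dummies(df["quarter"], prefix="Q", drop_first=True).astype(int)
    months = pd.get_dummies(df["month"], prefix="M", drop_first=True).astype(int)
    return pd.concat([df[["barrels", "t"]], quarters, months], axis=1)
```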
THE SEASONAL MODEL

In general, if we identify p seasons, we only need p - 1 indicator variables in the regression model. Therefore, any one of the seasonal indicators can be omitted, and the choice of which indicator to omit is left to the modeler. For example, in our quarterly seasonal example, although four quarterly periods are identified, only three indicator variables are needed. The excluded quarterly period will simply serve as a base-level period to measure changes in the other seasonal levels. Therefore, if the first quarterly indicator, X2t, is excluded (i.e., X2t will always be equal to 0), the seasonal regression model
can then be developed as follows:

Yt = b0 + b1X1t + b3X3t + b4X4t + b5X5t    (7)
The coefficients b3, b4, and b5 indicate the expected effects of seasonality in the second, third, and fourth quarterly periods, respectively, relative to the first quarterly period. To better understand how the regression model behaves, notice that for observations occurring in the first quarterly period (i.e., X3t = 0, X4t = 0, and X5t = 0) the regression model reduces to:

Yhat_t = b0 + b1X1t    (8)
This model represents the predicted values of import beer sales in the first quarterly period, indexed on time. The same logic applies to the semi-annual and monthly seasonal models. In the following examples, the first seasonal indicator is omitted for each of the three models studied.

Import Beer Sales Example

Fig. 1 displays three views of the partial data set for United States Import Beer sales in millions of barrels (Brls) from January 1999 to December 2007 in an Excel spreadsheet. The views are displayed to illustrate the differing setups, or data structures, that accompany each seasonal regression model. The data structure or setup in Excel is particularly important when using Excel's regression procedure; in other words, how the data is input and in which order it is featured matters (e.g., what data is in the first column versus the second column, etc.). View (a) in Fig. 1 displays the semi-annual model, view (b) the quarterly model, and view (c) the monthly model. Take note that (1) the data in the last column in each view is obtained after the Regression procedure is performed in Excel and is explained later in this chapter, (2) the first seasonal indicator is omitted in each model's setup (see the discussion in the Seasonal Model section), and (3) the independent variables (time and seasonal indicators) are all contiguous (i.e., side by side with no gaps). A discussion of residuals appears later in the section on calculating error.

Fig. 1. U.S. Beer Import Sales, 1999–2007.

Forecasting is an integral business activity that is covered to varying degrees in most business school curricula. One issue often left unexamined, even by practitioners, is the effect of the granularity of the data. That is, data occur annually, semi-annually, quarterly, bi-monthly, and even monthly. While most modelers simply assume that more data is better and
will always choose the most granular data, this chapter illustrates how increasing the granularity of the data, especially in the presence of seasonal effects, may strongly affect the ultimate efficacy of the model. Monthly data is available for sales for the period 1999–2007. In forecasting data over time, one of the most important perspectives to consider is the level of detail required for the forecast. In this data, the first forecast estimate that should be generated is not a monthly estimate but is instead the predictable annual growth in sales, as shown in Fig. 2. Depending on the management requirements, this forecast using simple linear regression may be sufficient. But in many cases, such as managing inventory that does not have annual capacity, it is necessary to generate a forecast over a smaller interval. Such models are constructed below: semi-annual, quarterly, and monthly.

Semi-annual. Generally one perceives that finer-focused data may produce better estimates. But, as shown in Fig. 3, a linear forecast constructed based upon the summary semi-annual sales actually reduces the overall explanatory power of the model because it introduces more variability. That is, not only is there variability from year to year; with finer data there is now also variability within each year.

Fig. 2. Annual Increase in U.S. Beer Import Sales (Barrels = 1.3446*YrIndicator - 2669.2, R2 = 0.9459; actual vs. predicted, 1999–2007).
Fig. 3. Simple Semi-Annual Increase in U.S. Beer Import Sales (actual vs. predicted, 1999–2007; R2 = 0.9223).
It should be noted that the values of the model are exactly one-half of the values of the annual model, which would be expected. The two extensions considered next are the addition of an indicator variable for the second half of the year and an alternative definition of the annual increase as pro-rated across the semi-annual periods. We find that in neither case does the finer focus improve the estimated model's R2 value. However, the models do allow insight into the monthly characteristics of import beer sales.

A semi-annual indicator variable is added which has the value of 0 for the first six-month period of any year and the value of 1 for the second six-month period of any year. The results of this regression are shown in Fig. 4.

Fig. 4. Semi-Annual Indicator Variable Model for Increase in U.S. Beer Import Sales (Barrels = -1334.6 + 0.6723*YrIndicator + 0.0094*SemiIndicator, R2 = 0.9223).

It is also important to note that a multiple regression with two independent variables, Year and Semi-Annual period, does not produce a superior model with regard to R2. The results show that the YrIndicator variable is highly statistically significant, with a coefficient of 0.672, as is the intercept, with a value of 1334.61, but that the Semi-Annual period is not statistically significant for any practical value of alpha, since its t-statistic is only 0.04. This example shows that it is not always the case that more and finer granularity in the data produces a superior estimate. Note that the extension of this is also not true, as shown below. That is,
the fact that one level of granularity of a data set is not useful does not establish that a finer level of granularity is also not useful. One of the most common examples of granularity in time series modeling is using a time trend variable: 1, 2, 3, ... In the present example, the year variable that was modeled had values 1, 1, 2, 2, 3, 3, ... (where 1 = first year of data, 1999; 2 = second year of data, 2000; etc.). That is, the year variable captured the annual increase, which was not pro-rated across semi-annual periods. If instead one used not the annual increase but the semi-annual increase, along with the semi-annual indicator variable, then similar results obtain, as shown in Fig. 5.

Fig. 5. Semi-Annual Indicator Variable Model for Increase in U.S. Beer Import Sales (Barrels = 8.942 + 0.3361*YrIndicator2 + (-0.3267)*SemiIndicator, R2 = 0.9234).

Of course, the R-squared of the model and the coefficient of the semi-years variable are products of the annual model. However, what has changed is the sign, scale, and statistical significance of the indicator variable. Previously it was positive; in the present model, it is negative. Previously it was of the order of 10,000 barrels; presently it is of the order of 300,000 barrels. Previously its t-statistic was a scant 0.04; presently its t-statistic is 1.26. Although the indicator variable is not statistically significant for any alpha less than 22%, the drastic change in the variable is important for the modeler to consider. In this particular case, the question
is whether or not the observed annual increase in beer sales should be pro-rated across the individual time units. In the present model, both cases showed that the indicator variable was not statistically significant and should be removed. There is no general answer as to whether or not to pro-rate increases observed during a larger time unit across the smaller included time units. The linearity of the model will ensure that the coefficients are linearly related, but the statistical reliability of the model may vary either higher or lower. The important point is that the modeler should consider whether or not linear changes observed at a larger time interval should be pro-rated across its included smaller time intervals. Having established that the annual increase does not appear to be better modeled as occurring linearly over the year, but rather is a distinctly yearly increase, this consideration is foregone in the next models. Since most modelers do not consider this alternative, for illustrative purposes it will be included again and compared in the final model obtained.

Quarterly. The categorical variables for quarters are coded, and the resulting multiple regression retains an R2 of 92%; its value is Barrels = -668.03 + 0.3361*YrIndicator + 1.453*Q2 + 1.0955*Q3 + 0.3671*Q4. All of the coefficients are highly statistically significant, with Q4 significant to the largest alpha value of 1.8% (see Fig. 6). The model suggests that the first quarter has the lowest expected sales and the highest sales are expected in early summer (Q2), followed by slightly lower sales in late summer (Q3), and continuing this decline to Q4, which is still about one-third of a million barrels above first-quarter sales. This model and the resulting multiple linear regression estimate are shown in Fig. 7.

We have seen that the annual model provided a very good overall estimate of annual increases in sales, but the semi-annual extension did not provide additional information. However, when the data are further granulized to the quarterly level, the estimate is improved further and does provide information that the semi-annual forecast could not. This process is continued for monthly estimates below.

Monthly. The categorical variables for months are coded and the resulting multiple regression, containing the previously determined annual increase and quarterly changes, is examined. It is also important to remember that when modeling nested categorical variables, one variable must always be "left out." In the present case, since the variable Q2 contains the months April, May, and June, it is necessary in constructing the monthly variables to leave out one of these months. In the present model, the first variable of each quarter is left out and there are no monthly variables for January,
Fig. 6. Quarterly Seasonal Model Regression Output.

Fig. 7. Multiple Regression Using Quarterly Categorical Variables (Barrels = -668.03 + 0.3361*YrIndicator + 1.453*Q2 + 1.0955*Q3 + 0.3671*Q4, R2 = 0.9287; actual vs. predicted, 1999–2007).

Table 4. Monthly Multiple Regression Estimates. [Excel regression output; the full coefficient estimates are not reproduced here.]
April, July, and October. The results from a monthly multiple regression are shown in Table 4. All of the remaining variables are statistically significant except for August and November (shaded in Table 4). A final estimation is considered
Table 5. Revised Multiple Regression Estimates.

Variable       Coefficients   Standard Error   t-Stat
Intercept      -222.89        0.056            17.526
YrIndicator    0.1120         0.005            20.635
Q2             0.5674         0.069            8.260
Q3             0.6980         0.059            11.735
Q4             0.3791         0.059            6.373
Feb            0.1486         0.069            2.164
Mar            0.5024         0.069            7.314
May            0.1943         0.069            2.830
Jun            0.2077         0.069            3.024
Sep            -0.3476        0.059            5.844
Dec            -0.1192        0.059            2.004
after removing these two variables, and the results are shown in Table 5. Each of the coefficients is statistically significant at least at the 5% level. The R-squared of the model is 88.1% and the standard error is 146,000 barrels (Fig. 8). The monthly model is shown in Fig. 9.

Fig. 8. Monthly Seasonal Model Regression Procedure Output.

The coefficient values found in Table 5 provide additional insight. These coefficients are used to construct the seasonal regression equation itself. Not all of the model parameter coefficients are positive for the monthly seasonal models. However, the coefficients are relative to each other within each model. For the quarterly model, it is seen that the Q4 (Oct, Nov, Dec) coefficient is lower than the coefficients for Q2 and Q3. This is also true for the Sep and Dec coefficients, as they are actually negative compared to the other monthly coefficients. Each of these parameters corresponds to the time frame of the early Fall into Winter seasons. In turn, it could be tentatively concluded that some type of winter effect is influencing the model.

The supposition of the "winter" effect comes from what the coefficients denote within the regression model itself. Within regression models, positive coefficients move in unison with changes in the model parameter. For example, if any parameter with a positive coefficient increases, there will be an increase in the predictor, Yhat_t, all things held constant. However, if that parameter's coefficient is smaller relative to the other parameters' coefficients, the increase in the predictor, Yhat_t, will be markedly less, all things held constant (i.e., it contributes less, hence the moniker "coefficients of contribution" being applied to the independent variable coefficients). From our example, it should be noted that the coefficient of the Q4 parameter is lower and that the Sep and Dec coefficients are negative when
compared to the other coefficients in the monthly model; therefore, they provide a damping effect on their predictors.

Fig. 9. Monthly Seasonal Regression of Beer Sales (Barrels = -222.89 + 0.112*YrIndicator + 0.5674*Q2 + 0.698*Q3 + 0.3791*Q4 + 0.1486*Feb + 0.5024*Mar + 0.1943*May + 0.2077*Jun + (-0.3476*Sep) + (-0.1192*Dec), R2 = 0.8811; actual vs. predicted, 1999–2007).

The lower and/or negative coefficients associated with the Q4, Sep, and Dec parameters are most likely attributable to the holiday lag and temperature drop that occur every year around November, December, January, and February in the United States. It is well known that beer sales in the United States lag during the colder months of each year and the holiday season (i.e., Thanksgiving, Christmas, and New Year's) due to consumers drinking other beverages, both alcoholic and non-alcoholic, and non-chilled (e.g., think champagne or eggnog around the holidays and Irish coffee or hot cider during the colder months) (http://www.foodandbeveragereports.com).

As a final note, if the annual time trend is pro-rated across the time periods (i.e., 1999 = 1, 2000 = 2, etc.), then the R2 drops slightly to 88.0%. Whereas it is natural and very common for modelers to pro-rate the annual increase per month by using an indexing time variable, it is not required. To further illustrate the concept, consider that the quarterly increase is never considered for pro-rating across the quarter, because it appears to be an indicator for something else. Similarly, the annual increases may reflect not linear growth but planning growth in terms of annual budgets, forecasts, or other items that do not in fact occur on a monthly basis
but on an annual basis and so the change should be modeled annually, not pro-rated per month.
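The monthly seasonal model described above can also be reproduced outside Excel with any OLS routine. The sketch below mirrors the structure of the final monthly model (an annual, non-pro-rated YrIndicator plus quarterly indicators and the retained monthly indicators), under assumed column names; it is an illustration, not the authors' procedure.

```python
# Sketch: monthly seasonal regression with an annual time variable plus
# quarterly and monthly 0/1 indicators (assumed DataFrame `data` with
# columns barrels, year, quarter, month; names are hypothetical).
import statsmodels.formula.api as smf

def fit_monthly_seasonal(data):
    data = data.copy()
    data["YrIndicator"] = data["year"]                 # e.g., 1999, 2000, ...
    for q in (2, 3, 4):                                # Q1 is the base level
        data[f"Q{q}"] = (data["quarter"] == q).astype(int)
    for name, m in [("Feb", 2), ("Mar", 3), ("May", 5), ("Jun", 6),
                    ("Sep", 9), ("Dec", 12)]:          # retained months only
        data[name] = (data["month"] == m).astype(int)
    formula = ("barrels ~ YrIndicator + Q2 + Q3 + Q4 "
               "+ Feb + Mar + May + Jun + Sep + Dec")
    return smf.ols(formula, data=data).fit()

# fit_monthly_seasonal(data).summary() reports coefficients comparable in
# form to Table 5 and an R-squared comparable to Fig. 9.
```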
Interpretation and Analysis of Regression Outputs

Nash and Quon (1996) provide a valuable overview of issues in teaching statistical thinking with spreadsheets. They discuss numerous advantages as well as deficiencies of employing spreadsheets for statistical teaching. One very important point they make is the verification of statistical output from spreadsheets via traditional statistical packages (e.g., SPSS). It must be noted here that all Excel regression results presented herein were verified via SPSS 13.0.

One advantage of employing Excel's regression procedure is the ease of interpreting the results of the regression analysis and understanding its "statistical quality." To interpret the regression analysis, readers should begin by checking the values for three statistical measures: the R Square (R2) statistic, the F-statistic, and the t-statistic. The value for R2 provides an assessment of the forecast model's accuracy. Specifically, it can be interpreted as the proportion of the variance in Y attributable to the variance in the X variables. Generally, values above 0.7 provide a minimum threshold of accuracy for business models, while values above 0.8 are considered very good (Kros, 2009). Table 6 presents the R2 values for each of the seasonal regression models, whereas Table 7 presents the F-statistics for each model. According to the analysis, any of the proposed seasonal regression models is statistically significant and could be used to effectively develop forecasts for U.S. import beer sales. However, a good manager needs to know which model produces the "best" results. The next section illustrates how to construct the seasonal regression equation for the monthly model, defines and describes error for each model, and discusses model efficacy via error coupled with the regression statistics.

Table 6. R2-Statistics for Various Seasonal Regression Models.

Model Type      R2
Annual          0.9459
Semi-annual     0.9234
Quarterly       0.9287
Monthly         0.8811
Table 7. F-Statistics for the Various Seasonal Regression Models.

Model Type      F-Statistic
Annual          122.44
Semi-annual     90.45
Quarterly       100.99
Monthly         71.86
Construction of the Seasonal Regression Equation and Error Calculation

From a mathematical perspective, the coefficient values are easily inserted into the original regression model to yield the seasonal regression forecasting equation. To illustrate the construction of the seasonal regression equation, the monthly seasonal model output is employed. In turn, the monthly seasonal regression equation can be written as:

yhat = -222.89 + 0.112*YrIndicator + 0.5674*Q2 + 0.698*Q3 + 0.3791*Q4 + 0.1486*Feb + 0.5024*Mar + 0.1943*May + 0.2077*Jun + (-0.3476*Sep) + (-0.1192*Dec)    (9)
To demonstrate the seasonal regression equation, a forecast for January 2008 will be created. Since the forecasting time period is January of 2008, YrIndicator = 2008. Since January falls in the first quarter and is the first month of the year, there are no quarterly or monthly indicators in the model; in turn, Q2 = 0, Q3 = 0, Q4 = 0, Feb = 0, Mar = 0, May = 0, Jun = 0, Sep = 0, and Dec = 0. Therefore, the seasonal regression model produces the following result:

ŷ = −222.89 + 0.112 YrIndicator = −222.89 + 0.112 (2008) = 2.006        (10)
The forecasted U.S. import beer sales for January 2008 are 2.006 million Brls. Calculating the error associated with a forecasting model is a cornerstone of determining a model's performance. Coupling the error measurement with the regression statistics allows a forecaster to comment on a model's overall efficacy. A model that performs well on all statistical measures and carries relatively low error can be deemed adequate. Error, also referred to as deviation, is defined as the actual less the predicted value. Excel can generate error terms for a regression model when prompted.
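A minimal sketch of evaluating the monthly seasonal regression equation (Eq. 9) for January 2008 is shown below. The coefficient values are taken from the text; the function and argument names are illustrative, not the original spreadsheet formulas.

```python
# Coefficients from the monthly seasonal regression equation (Eq. 9)
COEFFS = {
    "intercept": -222.89, "YrIndicator": 0.112,
    "Q2": 0.5674, "Q3": 0.698, "Q4": 0.3791,
    "Feb": 0.1486, "Mar": 0.5024, "May": 0.1943, "Jun": 0.2077,
    "Sep": -0.3476, "Dec": -0.1192,
}

def forecast(year, indicators):
    """Return the forecast (million Brls) given the year and a dict of 0/1 dummies."""
    yhat = COEFFS["intercept"] + COEFFS["YrIndicator"] * year
    for name, value in indicators.items():
        yhat += COEFFS[name] * value
    return yhat

# January 2008: all quarterly and monthly dummies are zero.
dummies = {k: 0 for k in COEFFS if k not in ("intercept", "YrIndicator")}
print(round(forecast(2008, dummies), 3))  # approximately 2.006
```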
Fig. 10.  Partial Example of Monthly Seasonal Regression Residual Output.
A forecaster can instruct Excel to automatically calculate the error terms, or residuals, of a regression model by checking the Residuals box in the Regression Dialogue Box. When the regression model is computed via the Regression Dialogue Box, the residual output is generated and returned below the regression output. Fig. 10 displays a partial example of the residual output for the monthly seasonal regression model within Excel. A standard measure of error in forecasting is the mean absolute percentage error (MAPE), which is calculated as follows:

MAPE = (1/n) Σ | (Actual_t − Predicted_t) / Actual_t |,   t = 1, …, n        (11)
Table 8.  Seasonal Regression Model MAPE Comparisons.

Seasonal Model   MAPE
Annual           3.0%
Semi-annual      3.1%
Quarterly        3.7%
Monthly          5.7%
Using the residual terminology from Excel's regression tool, the formula would resemble the following:

MAPE = (1/n) Σ | Residual_t / Actual_t |,   t = 1, …, n        (12)

To obtain MAPE for the residuals contained in Fig. 10, a user would first take the absolute value of each residual in the residual column (e.g., third column, Fig. 10), divide that absolute residual by the associated actual, and then take an average of those absolute percentage errors. Forecasters generally compute error for multiple forecasting models and then compare across models, where the model with the lowest error is considered superior. The MAPE for each of the seasonal models is presented in Table 8. From Table 8, the yearly regression model has the lowest MAPE of the four models. However, it does not provide the detail that a manager may need to make decisions in a shorter time frame (e.g., monthly inventory evaluation). Therefore, a manager most likely will use the yearly model for long-range planning purposes and may use the monthly model for shorter-range planning purposes. It is readily apparent from Fig. 9 that the monthly seasonal regression model fits the actual data very well by accounting for the seasonal trends in import beer sales over time. The figure also visually reinforces the calculated value of R² (0.88) presented earlier. Based on the regression statistics, MAPE, and the Predicted versus Actual graph (Fig. 9), it can be determined that the monthly seasonal model is doing an overall good job of forecasting U.S. import beer sales.
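The calculation is easy to reproduce outside of Excel. The sketch below is an illustration rather than the authors' spreadsheet; the sample actuals and predictions are made up.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, as in Eqs. (11) and (12)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted) / np.abs(actual))

# Hypothetical actuals and model predictions (million Brls); values are illustrative.
actual = [2.10, 2.45, 2.60, 2.30]
predicted = [2.00, 2.50, 2.55, 2.40]
print(f"MAPE = {mape(actual, predicted):.1%}")
```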
CONCLUSIONS, RECOMMENDATIONS, AND LIMITATIONS

Seasonality is found in almost every facet of marketing, from products to placement to promotion to price. It can be seen from the example and
subsequent model that Excel can accommodate seasonal data fairly easily. Although effective in the development of seasonal models, Excel does lack certain sophisticated statistical methods common to specialized forecasting software packages. However, this does not diminish the fact that Excel can and should be used to model seasonal patterns in data. That Excel is ubiquitous across industry PC software suites and widely employed by business managers for their everyday forecasting needs is reason enough to lend credence to its use. The authors believe that critical thinking is paramount in the forecasting context, especially when seasonality is present. It has been said that when students develop a forecast with easy-to-use software, they do not take the time to assess the reasonableness of the forecast, to evaluate what the underlying model is doing, or even to question whether the model was appropriate to begin with. Still others believe that students have a tendency to blindly accept the results of a computer forecasting model without evaluating it (also see Hunt, Eagle, & Kitchen, 2004). However, as this work has shown, it is not difficult for a spreadsheet user to overcome the mentality of "whatever comes out of the model must be correct", and students can be taught critical thinking regarding forecasting models. The authors suggest to marketing educators the following areas to focus on when imparting to students the intricacies of seasonal forecasting models:

Is the forecast performing well based on the historical pattern?
Is the model being used appropriate for the forecasting problem?
Is there supporting information that should accompany the forecast to help the user better understand the forecast?
Are the final results from the model not only realistic but also believable, and not "black box" type work?

Not only is this a call for a "reality check" but it also raises the "black box" issue in forecasting. Many business people have expressed confusion about how forecasts are derived (i.e., they come from the magical "black box") and need reassurance that the numbers they are being given aren't just something spit out of a complicated yet unseen device (i.e., the "black box"). The authors believe that by providing the model in a common venue, a spreadsheet, seasonal forecasting done in Excel can put clients more at ease with the model as well as the results. Limitations of this research could include a faculty member's own lack of computer skills, which could inhibit the quality of its presentation. There is also the natural reluctance to add yet another assignment to an already full
schedule, especially when it requires more time and effort in terms of preparation and grading. Another limitation arises from the issue of time. Most faculty feel pressed to cover all of the material in a given text and may be reluctant to spend more time on one topic if it necessitates the removal of another. Ultimately, the decision of whether or not to adopt active learning technologies, such as teaching students how to develop seasonal forecasting models with Excel, is one that requires careful consideration. On the one hand, instructors want their students to be prepared for life in the business world; on the other, as members of a larger College of Business, they must comply with accreditation mandates. However, the authors sincerely believe that most faculty within Colleges and Schools of Business possess the needed computer skills and drive to create and deliver seasonal regression models in Excel.
REFERENCES Albright, S. C., Winston, W., & Zappe, C. (2005). Data analysis and decision making with Microsoft Excel. Cincinnati, OH: South-Western College Publishing. Anderson, D. R., Sweeney, D. J., & Williams, T. A. (2005). Quantitative methods for business. Cincinnati, OH: Cengage Learning-Thompson Publishing. Armstrong, J. S., & Brodie, R. J. (1999). Forecasting for marketing. In: G. J. Hooley & M. K. Hussey (Eds), Quantitative methods in marketing (2nd ed, pp. 92–119). London: International Thompson Business Press. Chase, C. W., Jr. (1997). Selecting the appropriate forecasting method. The Journal of Business Forecasting, 16(3), 2, 23, 28–29. Conway, D. G., & Ragsdale, C. T. (1997). Modeling optimization problems in the unstructured world of spreadsheets. Omega, 25, 313–322. Fylstra, D., Lasdon, L., Watson, J., & Waren, A. (1998). Design and use of Microsoft Excel solver. Interfaces, 28(5), 29–55. Grossman, T. A. (1999). Why spreadsheets should be in or/ms practitioner’s tool kits. OR/MS Today, 26(4), 20–21. Hanke, J. (1984). Forecasting in business schools: A survey. Journal of Forecasting, 3, 229–234. Hanke, J. (1988). Forecasting in business schools: A follow-up survey. Unpublished Manuscript. Hanke, J., & Weigand, P. (1994). What are business schools doing to educate forecasters. The Journal of Business Forecasting Methods and Systems, 13(3), 10–12. Hunt, L., Eagle, L., & Kitchen, P. J. (2004). Balancing marketing education and information technology: Matching needs or needing a better match? Journal of Marketing Education, 26(1), 75–88. Keating, B., & Wilson, J. H. (1988). Forecasting: Practices and teachings. Journal of Business Forecasting, 6(10–13), 16. Kros, J. F. (2009). Spreadsheet modeling for business decisions. Dubuque, IA: Kendall/Hunt.
Lapide, L. (2002). New developments in business forecasting. The Journal of Business Forecasting, 21(1), 12–14. Loomis, D. G., & Cox, J. E. (2003). Principles for teaching economic forecasting. International Review of Economics Education, 2(1), 69–79. Moore, J. H., & Weatherford, L. R. (2001). Decision modeling with Microsoft Excel. New York: Prentice Hall. Nash, J. C., & Quon, T. K. (1996). Issues in teaching statistical thinking with spreadsheets. Journal of Statistics in Education, 4(1). Available at http://www.amstat.org Powell, S. G. (1997). From intelligent consumer to active modeler: Two MBA success stories. Interfaces, 27(3), 88–98. Radovilsky, Z., & Eyck, J. T. (2000). Forecasting with excel. The Journal of Business Forecasting, 19(3), 22–27. Ragsdale, C. (2006). Spreadsheet modeling and decision making. Cincinnati, OH: South-Western College Publishing. Savage, S. L. (2003). Decision making with insight. Boston: Duxbury Press. Sonntag, C., & Grossman, T. A. (1999). End user modeling improves R&D management at AgrEvo Canada, Inc. Interfaces, 29(5), 132–142. Wilson, J. H., & Daubek, H. J. (1989). Teaching forecasting: Marketing faculty opinions and methods. Journal of Marketing Education, 11, 65–71. Zhou, W. (1999). Integration of different forecasting models. The Journal of Business Forecasting, 18(3), 26–28.
DATA SOURCE

Beer Institute. (December 15, 2009). Beer Institute. http://www.beerinstitute.org.
A COMPARISON OF COMBINATION FORECASTS FOR CUMULATIVE DEMAND

Joanne S. Utley and J. Gaylord May

ABSTRACT

This study examines the use of forecast combination to improve the accuracy of forecasts of cumulative demand. A forecast combination methodology based on least absolute value (LAV) regression analysis is developed and is applied to partially accumulated demand data from an actual manufacturing operation. The accuracy of the proposed model is compared with the accuracy of common alternative approaches that use partial demand data. Results indicate that the proposed methodology outperforms the alternative approaches.
INTRODUCTION

The forecasting literature contains a number of articles that have shown that the combination of two demand forecasts is often more accurate than either of the component forecasts (e.g., see De Menzes et al., 2000, and Russell & Adam, 1987). The forecast combination models discussed in the literature typically use time series data for historical demand.
However, many companies possess partially accumulated demand data in addition to data on past sales. For instance, in the hospitality industry, a reservations system routinely furnishes information on partial demand for future time periods. Similarly, in a retail operation, partially accumulated demand data for a season offer a preview of cumulative demand for the season. This study will develop a linear programming methodology for forecasting a cumulative variable. The proposed methodology combines forecasts based on both historical demand and partially accumulated demand data. Data from an actual manufacturing operation will be used to illustrate the model. The accuracy of the proposed combination methodology will be compared with the accuracy of forecasts generated by alternative models that use partial demand data. Before model formulation and evaluation, a brief overview of the literature on partial demand forecasting models and forecast combination will be presented in the following section.
OVERVIEW OF FORECAST MODELS

When preparing a combined forecast that incorporates both historical demand data and partial demand data, a manager must first decide what type of component forecast models could be used and then decide how these component forecast models should be combined. In making these two decisions the manager has a wide variety of quantitative models from which to choose. Some of the possible models are straightforward and quite easy to implement while others are complex and require knowledge of statistics or mathematical programming. To facilitate the discussion of possible component models, this study will use the following notation:

t         a particular time period;
L         the longest lead time in the forecast process (or the maximum customer-designated lead time);
h         a specific lead time, where h ≤ L;
D(t, h)   the partially accumulated demand for period t occurring h or more periods in advance of t (or the sum of advance orders for period t for which the customer-supplied lead time ≥ h);
D(t)      the total (cumulative) demand for period t;
F(t, h)   the forecast for total demand in period t made h periods in advance of period t;
P(t, h)   the cumulative proportion of total actual demand for a time period t realized via accumulated advance orders for period t, h periods in advance;
C(t, h)   the exponentially smoothed cumulative proportion of total actual demand for a time period t realized via accumulated advance orders for period t, h periods in advance;
S(t, h)   the exponentially smoothed value of the unknown component of total demand for period t, h or more periods in advance of t.

If a manager would like to choose a component forecast model that is intuitively appealing and easy to apply, he or she might select the basic multiplicative model, which is sometimes referred to as the naïve model. The basic multiplicative model assumes that D(t, h), the partially accumulated demand for period t occurring h or more periods in advance of t, is the product of D(t), the cumulative demand for period t, and the cumulative proportion P(t, h). This assumption implies that a forecast for cumulative demand in period t can be found by dividing D(t, h) by P(t, h). For example, if a manager knows h periods in advance of period t that 25% of total demand for period t will already be known, and if the sum of advance orders for period t = 200 units, then the forecast for period t will be 200/.25 = 800 units (Kekre, Morton, & Smunt, 1990). The naïve model assumes that the P(t, h) values (or cumulative proportions) are stable over time. However, in practice, the demand build up curves may change over time; these changes may result from an altered relationship between supply and demand or from new ordering patterns produced by deteriorating (or improving) economic conditions (Bodily & Freeland, 1988). When the build up curves change over time, the corresponding P(t, h) values will also vary over time. In this situation, a forecaster can use exponential smoothing to update the cumulative proportions to reflect the new build up patterns (Bodily & Freeland, 1988). These smoothed cumulative proportions (C(t, h)) can then be used to forecast actual total demand:

F(t, h) = D(t, h) / C(t, h)        (1)
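A small numerical sketch of Eq. (1) may help. The smoothing constant and the demand figures below are invented for illustration and are not the values used later in the case study.

```python
def smooth_proportions(p_history, alpha):
    """Exponentially smooth the cumulative proportions P(t, h) to obtain C(t, h)."""
    c = p_history[0]
    for p in p_history[1:]:
        c = alpha * p + (1 - alpha) * c
    return c

# Hypothetical cumulative proportions observed in past periods for one lead time h
p_history = [0.25, 0.28, 0.22, 0.26]
c_th = smooth_proportions(p_history, alpha=0.3)

# Advance orders already booked for the target period (hypothetical)
d_th = 200
forecast = d_th / c_th          # Eq. (1): F(t, h) = D(t, h) / C(t, h)
print(round(c_th, 4), round(forecast, 1))
```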
In their simulation study, Bodily and Freeland (1988) showed that the multiplicative model given by Eq. (1) outperformed alternative models based on Bayesian analysis. Bodily and Freeland (1988) also showed that the model given by Eq. (1) outperformed a fairly robust alternative model based on the smoothed time series for actual total demand. This alternative
model used exponentially smoothed shipments inferred with smoothed gt,j factors, where gt,j is the portion of actual demand for period t that is booked exactly j periods in advance of t. It should be noted that this model requires two types of smoothing constants, a and b, where a is the smoothing constant for the factors and b the smoothing constant for shipments. Determining the best a and b (where "best" means minimizing the MAD, or mean absolute deviation) for each lead time (h) will increase forecast accuracy when this model is implemented (Bodily & Freeland, 1988). Another intuitively appealing component forecast model involves the additive approach. This method models the total demand for a time period as the sum of the partially accumulated demand and an estimate of the unknown (or unrevealed) segment of total demand. A forecast for period t generated by the additive model h or more periods in advance is given by:

F(t, h) = D(t, h) + S(t, h)        (2)
where S(t, h) is the smoothed value of the unknown segment of total demand. Kekre et al. (1990) tested the multiplicative model and the additive model and found that each outperforms exponential smoothing for forecasts of up to four periods in advance. They also reported that while the multiplicative model performs better than the additive model for one-period-ahead forecasts, the multiplicative model, unlike the additive model, tends to overestimate demand as lead times increase (Kekre et al., 1990). Kekre et al. (1990, p. 123) suggest that "One way to combine the best of both forecasting techniques would be to construct a forecasting model that would add the two components." Such a model would take the form:

F(t, h) = a + b D(t, h)        (3)
where a and b are estimated from historical data. Obviously, this model reduces to the multiplicative model if a = 0. In contrast, if b = 1, then this model reduces F(t, h) to the sum of the known portion of future demand (or the D(t, h) value) and the unknown portion of future demand (or the a value). Using Kekre et al.'s (1990) suggestion as a point of departure for their research, Guerrero and Elizondo (1997) devised a set of L simple linear regressions to model total demand as a function of partially accumulated demand:

D(t) = b0 + b1 D(t, h) + et,   for h = 1, 2, …, L        (4)
Guerrero and Elizondo (1997, p. 188) note that their statistical approach minimizes subjective input from the manager when a forecasting technique
is selected. They used both the print shop order data found in the paper by Kekre et al. (1990) and Mexican economic data to test their approach. Guerrero and Elizondo (1997) found that their approach compared favorably with multiplicative techniques. Finally, another method for developing a forecast that exploits advance order data involves a Bayesian approach. This approach was the basis for some of the earliest papers on partial demand forecasting and was used to solve the style goods problem. In this problem, there are only limited opportunities for buying or producing a product, the selling period is finite, and actual sales become known over the finite selling period (Chang & Fyffe, 1971; Fildes & Stevens, 1978; Green & Harrison, 1973; Murray & Silver, 1966). Bestwick (1975) and Guerrero and Elizondo (1997) have criticized the complexity of the Bayesian approach, noting that less formidable mathematical models are needed in practice. Once the component forecast models have been selected, it is necessary to find a way of combining them to produce a single forecast for total demand. While many possible approaches exist, a number of forecast combination models based on weighted averages have been studied (Clemen, 1989). The simplest model uses the arithmetic mean, which assumes equal weights for all component forecasts:

Ft = (Σ fj) / n        (5)
where fj is the component forecast j and n the number of component forecasts to be combined. Although this approach has performed well in practice and has actually outperformed more complex mathematical models in several studies, it does not account for the relative accuracy of the component forecasts (Dielman & Pfaffenberger, 1988). In contrast, the ‘‘outperformance’’ technique proposed by Bunn (1975) produces a weighted average in which each weight represents the fraction of time that a specific component forecast model has outperformed the other component forecast models in the combination (Bunn, 1975; De Menzes et al., 2000; Gupta & Wilton, 1987). Linear weights are also generated when the optimal method proposed by Bates and Granger (1969) is used for forecast combination. The optimal method selects weights to minimize the error variance of the combined forecast (Bates & Granger, 1969). The optimal method can produce a biased combined forecast when biased component forecasts are used (Granger & Ramanthan, 1984).
Unlike the optimal model, ordinary least squares (OLS) regression produces an unbiased forecast even when component forecasts are biased (De Menzes et al., 2000). In the OLS approach, component forecasts serve as the independent variables while the observed value for the forecasted variable is the dependent variable (De Menzes et al., 2000). Thus, if two component forecasts (f1 and f2) are used in the combination, the model has the form:

Dt = b0 + b1 f1t + b2 f2t + et        (6)
where Dt is the actual value of the forecasted variable for period t, fjt the forecast for period t generated by component forecast model j, b0 the constant term, bj the regression coefficient for component forecast j, and et the error term for period t. The OLS regression model shown in Eq. (6) will minimize the sum of the squares of the error terms for the forecast; however, in some practical applications it may be advisable to minimize the sum of the absolute values of the error terms instead. In this case, a least absolute value (LAV) regression model should be used (Birkes & Dodge, 1993). Forecasters have tended to neglect multiple LAV regression as a possible forecast combination technique, despite the fact that the LAV model tends to be less sensitive to outliers than the OLS approach (Dielman & Pfaffenberger, 1988; Narula & Korhonen, 1994). While there are no formulas for the constant term and regression coefficients in the LAV model, estimates for these parameters can be found via linear programming (Charnes, Cooper, & Ferguson, 1955; Wagner, 1959). If it is assumed that the LAV model utilizes two component forecasts (f1 and f2), then the LAV combined forecast for total demand in period t is given by:

Ft = b0 + b1 f1t + b2 f2t        (7)
The linear programming formulation for this LAV model can be stated as:

Minimize     Σ (Pi + Ni)
Subject to:  b0 + b1 f1i + b2 f2i + Pi − Ni = Di   for i = 1, 2, …, n
             all Pi, Ni ≥ 0;  b0, b1, and b2 unrestricted,

where b0 is the regression constant, bj the regression coefficient for component forecast model j (j = 1, 2), f1i the component forecast 1 for period i, f2i the component forecast 2 for period i, Di the actual value of total demand in period i, Pi = Di − (b0 + b1 f1i + b2 f2i) (the positive deviational variable for constraint i), Ni = (b0 + b1 f1i + b2 f2i) − Di (the negative
deviational variable for constraint i), and n the number of observations in the data set. In the following section, the linear programming model shown above will be used in conjunction with partial order data and historical data from an actual manufacturing operation to produce a forecast for total demand. Results from this model application will be compared with the results from alternative forecast models that use partial order data.
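For readers who want to experiment with the formulation, the sketch below solves the same linear program with scipy rather than the LINGO software mentioned later; it is an illustrative implementation under that assumption, and the component forecasts and demands are made-up numbers, not the study's data.

```python
import numpy as np
from scipy.optimize import linprog

def lav_combination(f1, f2, d):
    """Fit Ft = b0 + b1*f1 + b2*f2 by least absolute value regression via an LP."""
    f1, f2, d = map(np.asarray, (f1, f2, d))
    n = len(d)
    # Decision variables: [b0, b1, b2, P_1..P_n, N_1..N_n]
    c = np.concatenate([np.zeros(3), np.ones(2 * n)])        # minimize sum of deviations
    A_eq = np.hstack([np.column_stack([np.ones(n), f1, f2]),
                      np.eye(n), -np.eye(n)])                 # b0 + b1*f1 + b2*f2 + P - N = D
    bounds = [(None, None)] * 3 + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=d, bounds=bounds, method="highs")
    return res.x[:3]                                          # b0, b1, b2

# Hypothetical component forecasts and actual demands (illustrative only)
f1 = [200, 250, 180, 260, 240, 210]
f2 = [195, 260, 175, 255, 250, 205]
d  = [205, 255, 185, 258, 245, 212]
print(lav_combination(f1, f2, d))
```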
CASE STUDY

To evaluate the performance of the multiple LAV model discussed in the preceding section, the model was applied to data from an actual manufacturing shop. The shop produced a variety of electronic components for a number of customers. For simplicity, this section will discuss the problem of forecasting demand for just one of the components produced. Analysis of historical demand data for this component revealed that the order quantity varied with each customer order. Customer-designated lead times also varied with each order. Typically, these lead times ranged from 1 month to 4 months, although on rare occasions customers specified lead times as long as 5 or 6 months. Nine months of historical demand data for this component were available. The forecast problem for this manufacturer was to predict total demand for this component for a 4-month planning horizon that extended from month 10 to month 13. At the end of month 9, advance order data for months 9–13 were also available. Working with both the historical data for months 1–9 and the partial order data for months 10–13, the authors devised a forecast based on the LAV forecast combination model discussed in the previous section. The forecast process consisted of a series of stages, which are summarized in the following sections.
Stage 1: Preliminary Data Analysis

The advance order data for total demand actually realized in months 1–9 were organized by the customer-specified lead times (h). For all practical purposes, L, the maximum customer-designated lead time, was 4. Four time series for partially accumulated demand were generated for months 1–9. Thus, the time series for h ≥ 1 consisted of the values D(t, 1) where t = 1, 2, 3, …, 9, the time series for h ≥ 2 consisted of the values D(t, 2)
Table 1.  Exponentially Smoothed P(t, h) Values.

Month        C(t, 1)   C(t, 2)   C(t, 3)   C(t, 4)
1            .975      .953      .804      .47
2            .975      .953      .804      .47
3            .9756     .9661     .8224     .398
4            .9785     .9629     .8935     .7
5            .9852     .9753     .8775     .663
6            .974      .9551     .918      .787
7            .974      .9619     .9459     .922
8            .9616     .9281     .8854     .83
9            .9056     .7856     .6397     .429
C(t, h)      .9054     .7859     .6406     .202
Best Alpha   .31       .77       .68       .99
MAD          .032      .037      .0884     .1737
where t = 1, 2, 3, …, 9, and similar series were generated for h ≥ 3 and h ≥ 4. Since actual total demands for months 1–9 were known, it was possible to compute four additional time series P(t, h) for months 1–9, where P(t, h) was defined as the cumulative proportion of total actual demand realized via accumulated advance orders for month t, h months in advance. Analysis of these cumulative proportions showed that they did not remain constant over time for any of the four lead time values. To adjust for this instability in cumulative proportions, the authors developed exponentially smoothed forecasts for these values. For each time series, the authors used the alpha value that minimized the MAD. The four smoothed series are shown in Table 1.
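The smoothing and alpha search in Stage 1 can be reproduced along the following lines. This is an illustrative sketch rather than the authors' actual procedure; the grid of candidate alpha values, the smoothing recursion details, and the sample proportions are assumptions.

```python
import numpy as np

def smooth_series(values, alpha):
    """Simple exponential smoothing; returns the one-step-ahead smoothed series."""
    smoothed = [values[0]]
    for v in values[:-1]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return np.array(smoothed)

def best_alpha(values, grid=np.arange(0.01, 1.0, 0.01)):
    """Pick the alpha that minimizes the MAD between observed and smoothed values."""
    values = np.asarray(values, dtype=float)
    mads = [np.mean(np.abs(values - smooth_series(values, a))) for a in grid]
    i = int(np.argmin(mads))
    return grid[i], mads[i]

# Hypothetical cumulative proportions P(t, h) for one lead time (illustrative only)
p = [0.975, 0.976, 0.979, 0.985, 0.974, 0.974, 0.962, 0.906, 0.910]
alpha, mad = best_alpha(p)
print(f"best alpha = {alpha:.2f}, MAD = {mad:.4f}")
```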
Stage 2: Developing the Multiplicative Forecasts

Using the four time series for D(t, h) generated in Stage 1, the four time series for the exponentially smoothed proportions (C(t, h)) listed in Table 1, and the multiplicative model given by Eq. (1), the authors developed four forecast time series for total demand for months 1–9. These forecast time series are listed in Table 2 as the f1 time series.
Stage 3: Exponential Smoothing of the Multiplicative Forecasts

The f1 time series found in Stage 2 were exponentially smoothed to produce a second set of forecasts for total demand for months 1–9. These forecasts
Table 2.  Multiplicative Forecasts: Months 1–9.

Month   f1 (h ≥ 1)   f1 (h ≥ 2)   f1 (h ≥ 3)   f1 (h ≥ 4)
1       321          321          321          321
2       267          271          275          226
3       262          259          293          457
4       197          196          188          183
5       244          246          270          300
6       193          195          202          226
7       175          174          165          164
8       171          169          124          108
9       262          262          263          124
MAD     6.89         7.4          19.67        43.44
Table 3.  Exponentially Smoothed Multiplicative Forecasts: Months 1–9.

Month       f2 (h ≥ 1)   f2 (h ≥ 2)   f2 (h ≥ 3)   f2 (h ≥ 4)
1           321          321          321          321
2           321          321          321          321
3           267          271          294          269
4           262          259          294          372
5           197          196          232          268
6           244          246          254          286
7           193          195          224          253
8           175          174          190          203
9           171          169          152          151
Best Beta   .99          .65          .47          .25
MAD         41.67        42.67        49.33        60.0
are listed in Table 3 as the f2 time series, where f2 is defined by:

f2(t, h) = b [D(t − 1, h) / C(t − 1, h)] + (1 − b) f2(t − 1, h)        (8)
To generate the f2 time series, the authors selected the beta value for each h (h = 1, 2, 3, 4) that minimized the MAD. This forecast model is similar in approach to the smoothed shipment series model studied by Bodily and Freeland (1988).
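A compact sketch of one update step of Eq. (8) is shown below, assuming the previous period's advance orders, smoothed proportion, and beta value are already available; the numbers are placeholders, not the study's data.

```python
def f2_update(prev_f2, prev_d, prev_c, beta):
    """One step of Eq. (8): smooth the implied multiplicative forecast."""
    return beta * (prev_d / prev_c) + (1 - beta) * prev_f2

# Placeholder values for one lead time h
print(f2_update(prev_f2=250.0, prev_d=230.0, prev_c=0.95, beta=0.4))
```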
Table 4.  LAV Regression Combination Models.

Month   Intercept   b1 Coefficient   b2 Coefficient   LAV Forecast
10      142.877     0.0              .384             243
11      22.329      .918             0.0              365
12      128.779     0.0              .427             235
13      194.527     .143             0.0              201
Stage 4: Establishing the LAV Combination Forecasts

Using the f1 series from Table 2, the f2 series from Table 3 (as component forecasts), and the LAV approach discussed earlier, the authors developed an LAV model for each customer-designated lead time h (h = 1, 2, 3, 4). These LAV models are shown in Table 4. Since advance order data were available for months 10–13, it was possible to estimate total demand for months 10–13 with the component forecast models. The advance order data and the f1 and f2 values for months 10–13 are given in Table 5.
Stage 5: Establishing the Alternative OLS Models

The OLS regression approach proposed by Guerrero and Elizondo (1997) was applied to the data to generate a series of alternative forecasts. For each of the four lead time periods, total monthly demand D(t) was modeled as a linear function of D(t, h). The OLS models are given in Table 6. This table illustrates that the accuracy of the OLS model (as measured by the MAD) declined as the length of the lead time period increased.
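For readers who want to reproduce Stage 5, a sketch of fitting one simple OLS regression per lead time with numpy follows; the demand and advance-order arrays are placeholders, not the shop's data.

```python
import numpy as np

def ols_per_lead_time(total_demand, partial_by_h):
    """Fit D(t) = b0 + b1 * D(t, h) separately for each lead time h (Eq. 4)."""
    results = {}
    y = np.asarray(total_demand, dtype=float)
    for h, partial in partial_by_h.items():
        X = np.column_stack([np.ones(len(y)), np.asarray(partial, dtype=float)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # [b0, b1]
        mad = np.mean(np.abs(y - X @ coef))
        results[h] = (coef[0], coef[1], mad)
    return results

# Placeholder monthly totals and advance-order series for two lead times
D = [330, 270, 255, 200, 240, 200, 180, 175, 260]
partial = {1: [320, 262, 248, 190, 236, 188, 172, 168, 236],
           2: [300, 250, 240, 182, 230, 180, 165, 158, 205]}
for h, (b0, b1, mad) in ols_per_lead_time(D, partial).items():
    print(h, round(b0, 2), round(b1, 3), round(mad, 2))
```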
Stage 6: Model Comparison

Table 7 compares the monthly error measures for the multiplicative forecasts (f1) and the LAV forecasts. The LAV models outperformed the multiplicative forecasts for months 10, 12, and 13. The MAD for the f1 series was 96; the MAD for the LAV model was 20.5. The mean absolute percent error (MAPE) for the f1 series was 39.9%; the MAPE for the LAV forecast was 6.67%. Table 8 compares the monthly error measures for the smoothed forecasts (f2) and the LAV forecasts. The LAV models outperformed the smoothed
Table 5.  Partial Demand Data and Component Forecasts: Months 10–13.

Month   D(t, 1)   D(t, 2)   D(t, 3)   D(t, 4)   f1    f2
10      190       168       114       71        210   261
11                293       196       64        373   167
12                          63        18        98    249
13                                    9         45    212
Table 6.  OLS Regression Results: Months 1–9.

Dependent Variable   Independent Variable   Intercept   Beta    MAD
D(t)                 D(t, 1)                37.0814     .8937   10.008
D(t)                 D(t, 2)                51.7565     .8551   14.7902
D(t)                 D(t, 3)                102.0647    .7004   26.794
D(t)                 D(t, 4)                243.122     .0389   38.497
Table 7.  Error Measures: Multiplicative Model versus LAV Combination Forecast Model.

Month   Actual Total   Multiplicative   Error Terms,     Combination   Error Terms,
        Demand         Forecast (f1)    Multiplicative   Forecast      Combination
10      235            210              25               243           -8
11      405            373              32               365           40
12      264            98               166              235           29
13      206            45               161              201           5
MAD                                     96                             20.5
MAPE                                    39.9%                          6.67%
models for months 10, 11, and 13 in the forecast horizon; the smoothed forecast had a smaller error in month 12. The MAD for the smoothed series was 71.25 and the MAPE was 19.6%. Tables 7 and 8 show that the LAV combination outperformed the two component forecast approaches even though both component forecasts exploited the availability of advance order data for months 10–13. Table 9 compares the monthly error measures for the OLS forecasts and the LAV forecasts. The LAV model outperformed the OLS approach for all four months in the forecast horizon. The MAD for the OLS series was 71.25
Table 8.  Error Measures: Smoothed Multiplicative Model versus LAV Combination Forecast Model.

Month   Actual Total   Smoothed        Error Terms,   Combination   Error Terms,
        Demand         Forecast (f2)   Smoothed       Forecast      Combination
10      235            261             -26            243           -8
11      405            167             238            365           40
12      264            249             15             235           29
13      206            212             -6             201           5
MAD                                    71.25                        20.5
MAPE                                   19.6%                        6.67%
Table 9.  Error Measures: LAV Model versus OLS Model.

Month   Actual Total   LAV Combination   Error Terms,      OLS        Error Terms,
        Demand         Forecast          LAV Combination   Forecast   OLS
10      235            243               -8                207        28
11      405            365               40                302        103
12      264            235               29                146        118
13      206            201               5                 242        -36
MAD                                      20.5                         71.25
MAPE                                     6.67%                        24.87%
and the MAPE was 24.87%. The MAD for the LAV forecast series was 20.5 and the MAPE was 6.67%.
DISCUSSION

The forecast model proposed in this study differs from other forecast models that incorporate advance order data because it utilizes a combination methodology rather than a single model. The two forecast models that were combined are each rather easy for a practitioner to understand and implement. In addition, both component models include an updating process for the cumulative proportions P(t, h) so that the forecaster can compensate for changes in the demand build up curves over time. The forecast combination method in this study utilized multiple LAV regression rather than the more commonly used OLS combination approach. The LAV method offers several advantages. First, the LAV model minimizes
the sum of the absolute deviations of the error terms – rather than minimizing the sum of the squares of the error terms as in the case of OLS regression. Consequently, the LAV model can be readily compared to each component forecast series, which used smoothing constants that minimized the MAD. Second, LAV regression is more robust to departures from the usual normality assumptions than is OLS regression. Third, the LAV model can be written as a linear program which allows sensitivity analysis. This analytical feature is especially useful in studying the effect of a mis-specified point. While the LAV approach offers advantages, it also presents some challenges to practitioners. First, some managers may prefer a simpler combination methodology because they lack the mathematical training needed for more complex approaches. Second, managers with some statistical training may prefer the alternative OLS forecast model (given by Eq. (4)) which can be routinely modeled on Excel. However, some linear programming software packages such as LINGO are also fairly accessible to practitioners with training in quantitative methods. Furthermore, in this research context the LAV approach proved more accurate than the alternative OLS model. This study provided an illustrative example of a forecast combination technique that incorporated lead time data. Additional research needs to be done to compare the performance of the LAV model to that of other combination techniques. The results from this study demonstrated that the LAV combination method provided greater forecast accuracy than either of the component forecasts or the alternative OLS model. This finding suggests that combination techniques – used in conjunction with established forecast models for advance order data – can lead to greater accuracy in demand forecasting for future planning horizons.
REFERENCES Bates, J., & Granger, C. (1969). The combination of forecasts. Operational Research Quarterly, 20, 451–468. Bestwick, P. (1975). A forecast monitoring and revision system for top management. Operational Research Quarterly, 26, 419–429. Birkes, D., & Dodge, Y. (1993). Alternative methods for regression. New York: Wiley. Bodily, S., & Freeland, J. (1988). A simulation of techniques for forecasting shipments using firm orders-to-date. Journal of the Operational Research Society, 39, 833–846. Bunn, D. (1975). A Bayesian approach to the linear combination of forecasts. Operational Research Quarterly, 26, 325–329.
Chang, S., & Fyffe, D. (1971). Estimation of forecast errors for seasonal style goods. Management Science, 18, l89–l96. Charnes, A., Cooper, W., & Ferguson, R. O. (1955). Optimal estimation of executive compensation by linear programming. Management Science, 1, 138–151. Clemen, R. (1989). Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 559–583. De Menzes, L., Bunn, D., & Taylor, J. (2000). Review of guidelines for the use of combined forecasts. European Journal of Operational Research, 120, 190–204. Dielman, T., & Pfaffenberger, R. (1988). Least absolute value regression: Necessary sample sizes to use normal theory inference procedures. Decision Sciences, 19(4), 734–743. Fildes, R., & Stevens, C. (1978). Look- no data: Bayesian forecasting and the effects of prior knowledge. In: R. Fildes & D. Woods (Eds), Forecasting and planning. New York: Prager. Granger, C., & Ramanthan, R. (1984). Improved methods of forecasting. Journal of Forecasting, 3, 197–204. Green, M., & Harrison, P. (1973). Fashion forecasting for a mail order company using a Bayesian approach. Operational Research Quarterly, 24, 193–205. Guerrero, V., & Elizondo, J. (1997). Forecasting a cumulative variable using its partially accumulated data. Management Science, 43(6), 879–889. Gupta, S., & Wilton, P. (1987). Combination of forecasts: An extension. Management Science, 33(3), 356–372. Kekre, S., Morton, T., & Smunt, T. (1990). Forecasting using partially known demands. International Journal of Forecasting, 6, 115–125. Murray, G., & Silver, E. (1966). A Bayesian analysis of the style goods problem. Management Science, 12(11), 785–797. Narula, S., & Korhonen, P. (1994). Multivariate multiple linear regression based on the minimum sum of absolute errors criterion. European Journal of Operational Research, 73, 70–75. Russell, T., & Adam, E. (1987). An empirical evaluation of alternative forecasting combinations. Management Science, 33(10), 1267–1276. Wagner, H. (1959). Linear programming techniques for regression analysis. American Statistical Association Journal, 54, 202–212.
CHANNEL SHARE PREDICTION IN DIRECT MARKETING RETAILING: THE ROLE OF RELATIVE CHANNEL BENEFITS

Eddie Rhee

ABSTRACT

The direct marketing retailers have traditionally provided mail order and call center channels. With the emergence of the Internet channel, the direct marketing retailers have reported a large increase in the use of the Internet channel, and some have encouraged their customers to use the Internet channel more than other channels due to potential cost savings for the firm. However, over a decade of Internet usage, the traditional Call Center channel has not disappeared in the direct marketing industry. This study is motivated by this observation and incorporates the variables that capture the benefits of using different channels in the multi-channel choice model. We apply the proposed model to a transactional database from a direct marketing retailer that operates multiple channels. Our empirical result shows that the multi-channel choice model that incorporates the channel benefits has stronger channel share prediction power than the model without. It further shows that consumers are more likely to choose the Internet channel when the consumer has low perceived risk and high experience and familiarity with the purchase, but they are more likely to
choose the Call Center when the consumers have high perceived risk and low experience and familiarity.
INTRODUCTION Direct marketing retailers have traditionally mailed catalogs and provided consumers with order channels such as mail order and call center (David Sheppard Associates, Inc. 1999). With the advent of the Internet channel, the direct marketing retailers have reported a channel shift toward the Internet channel. The direct marketing retailers have perceived high cost of maintaining a Call Center and encouraged their customers to use the Internet channel for its convenience and availability. However, we have observed over a decade of Internet usage that the Internet channel never eliminated the other channels in direct marketing retailing. Even though the Internet channel usage has increased, the usage of other traditional channels has maintained its position. This industry trend implies that the consumers may perceive relative benefits of using different channels, and due to these benefits, it would be the best interest of the direct marketing retailers to maintain the traditional channels as well as the Internet channel. In maintaining the channels, they would want to efficiently allocate their resources to the channels since some channels such as Call Center are expensive to manage. Then they would want to have an accurate prediction of the relative channel shares for an efficient allocation of the resources. Under the observed multi-channel usage behavior in the industry, the current study proposes a multi-channel choice model that investigates the relative benefits of using different channels and demonstrates that such model structure would produce a better prediction of the channel share of the direct marketing retailer than the model without such structure and also provide insights about the relative benefit of channels in direct marketing retailing. Studies in marketing literature discuss some factors that influence the channel usage behavior. Balasubramanian, Raghunathan, and Mahajan (2005) suggest five goal factors that influence the consumer’s channel choices at different stages of purchase process. They argue that psychological factors are important in the choice process. However, this study does not provide any empirical study to test the influences. The current study focuses on the perceived shopping cost in the channel choice process given our transaction database of a direct marketing retailer.
Other marketing studies investigate the contribution of channel to the firm’s revenue and loyalty. Kumar and Venkatesan (2005) show that customers who shop across multiple channels provide higher revenues and higher share of wallet and are more active than those who shop in one channel. However, this study does not investigate the channel choice behavior that will make the consumer a multi-channel shopper. Shankar, Smith, and Rangaswamy (2003) find that loyalty to a service provider is higher when it is chosen online versus offline. The current study focuses on the channel choice within a direct marketing setting where there are only remote channels. Since the remote channels carry the same products and prices and have no distance effect, we are able to study the channel choice behavior with these factors controlled. Other studies explore channel cannibalization. Biyalogorsky and Naik (2003) find that online sales do not significantly cannibalize retail sales and that the firm’s online activities build long-term online equity. Dleersnyder, Inge, Gielens, and Dekimpe (2002) find little evidence that the introduction of a newspaper website cannibalizes circulation of the print version. However, these studies do not consider under what conditions the Internet channel will not cannibalize the sales of other channels. We consider the nature of the shopping basket and investigate the effect on the channel choice. Other study discusses the influence of acquisition channels on the customer retention and cross-buying. Verhoef and Donkers (2005) find that for a financial services provider, customers acquired through different channels differ with respect to retention rate and cross-buying behavior. However, this study does not investigate what channels the customers use after they are acquired. The current study focuses on the channel choice behavior after acquiring the customers. Other study investigates channel migration. Ansari, Mela, and Neslin (2008) find that Internet has negative relationship with a firm’s future sales, and the catalogs have a positive effect on the purchase inertia. They also find that this negative impact of Internet on demand can be mitigated by the firm’s marketing activities. However, they do not consider the nature of a shopping basket. The remainder of the chapter is organized as follows: we first describe a direct marketing retailer’s customer transaction data. We then explain the details of the multi-channel choice model discussing its components. The model is then applied to the retailer’s database, and we show how much the channel share prediction improves and discuss the estimation results. Finally, we discuss managerial implications and conclude by discussing possible steps for future research.
DIRECT MARKETING RETAILER'S TRANSACTION DATA

We use a database of transactions from January 2002 to December 2004 with a direct marketing retailer. The retailer offers a number of channels for placing an order: Internet, Call Center, Mail Order, and Voice Response System (VRS). Mail Order is a traditional order channel in which consumers use post cards to fill out their product orders manually and mail them in. The VRS is an automated phone system in which consumers call the firm and input their orders using the touch keys. The final data set consists of 782 customers with 13,958 observations. We hold out the last four purchases for the model performance test.
MULTI-CHANNEL CHOICE MODEL

In studying the channel choice of the consumers of the direct marketing retailer, we consider the relative benefits of using channels related to the problem-solving situations of the consumer. In a routine problem-solving situation, the consumer would have low perceived risk and high experience and familiarity with the purchase. However, in an extended problem-solving situation, the consumer would have high perceived risk and low experience and familiarity with the purchase. We now formulate a multinomial logit model for channel choice at the individual level:

P(h, c, t) = exp(−TSCh,c,t) / Σ (c = 1 to 4) exp(−TSCh,c,t)        (1)
where TSCh,c,t is the total perceived shopping cost of household h for an order channel c at time t. Since the probability of using a channel increases as the total perceived shopping cost decreases, we put a negative sign on the cost. The total perceived shopping costs for household h for channel c in time t are given by:

TSCh,c,t = CACh,c,t + BCh,c,t + SFh,c,t + εh,c,t        (2)
where CACh,c,t is the perceived channel access cost associated with household h using channel c at time t, BCh,c,t is the perceived shopping basket–related costs, and εh,c,t represents any other variables not captured by
the model. We also include the situational factors SFh,c,t to control for the seasonality and time trend effects. In the transaction database, seasonality exists in January and July, and these seasonal changes in sales need to be recognized in the model. Also, due to the obvious industry trend, the Internet channel usage has increased, and this time trend effect is also controlled for.
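A compact sketch of Eqs. (1) and (2) is given below. The cost values are placeholders, and the softmax-over-negative-costs form follows the description in the text rather than the authors' estimation code.

```python
import numpy as np

def channel_probabilities(total_costs):
    """Eq. (1): choice probabilities fall as perceived shopping cost rises."""
    u = -np.asarray(total_costs, dtype=float)   # negative sign on the cost
    u -= u.max()                                # stabilize the exponentials
    expu = np.exp(u)
    return expu / expu.sum()

# Hypothetical total perceived shopping costs for Internet, Mail, Call Center, VRS
tsc = [1.2, 2.5, 3.8, 3.0]
print(channel_probabilities(tsc).round(3))
```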
Perceived Channel Access Cost

The perceived channel access costs represent the costs of accessing the channel regardless of the nature of what is in the shopping basket:

CACh,c,t = b0c        (3)
where b0c represents the perceived cost of accessing the channel c. Since this is an intercept in the shopping cost model, this can be interpreted as the perceived cost of using the channel in the absence of any other costs. The perceived access costs of Call Center may include the perceived cost of dialing up, potential wait times and the time involved in providing account information, and going through a verification process. The perceived access costs of Internet channel may consist of the perceived cost of loading the retailer’s home page, logging in, navigating the web site, and checking out. Since the computer is connected to the Internet in many homes, and the log-in name and password are already saved in the computer, the perceived cost of accessing the Internet channel may not be as high as that of the Call Center.
Perceived Shopping Basket–Related Costs

The perceived shopping basket–related costs are the costs that the customers perceive due to the nature of the shopping basket:

BCh,c,t = b1c BasketSizeh,t + b2c Valueh,t + b3c Firsth,t        (4)
BasketSizeh,t is the household h’s number of products in the basket. In the Internet channel, the consumer perceives the cost of time and effort of entering the product orders in the shopping cart. In the Call Center, the consumer spends the time and effort in reading out the products to the call center associate and receiving a confirmation. It would take more effort to talk to a sales associate in the Call Center than to enter product orders in the shopping cart online.
Valueh,t is defined by the total monetary value of the products in the shopping basket. This variable represents the degree of perceived risk with the purchase. Consumers anticipate the accuracy of the decision-making and the amount of effort they put and choose a particular strategy that represents the best trade-off for the task at hand (Payne, Bettman, & Johnson, 1993). When the monetary value of the products in the shopping basket is high, the consumer naturally perceives higher risk and may be willing to put larger amount of effort. In the Internet channel, the consumers can simply put products in the shopping cart, but in the Call Center, the consumers need to put time and effort in talking to the sales associates. However, the Call Center would enable the consumer to ask about the high-priced products that bears perceived risk and hence provide a higher accuracy of ordering the right products for the consumers. Firsth,t is defined by the proportion of the product categories in the shopping basket that are the first purchases for household h. For instance, if all the product categories in the shopping basket are the first purchases to the household, the First variable becomes one. This variable represents the experience/familiarity of the products in the shopping basket. Alba and Hutchinson (1987) define familiarity as the number of product-related experiences that have been accumulated by the consumer. They find that the ability to elaborate on given information, generating accurate knowledge that goes beyond what is given, improves as familiarity increases. Therefore, when there is low familiarity to the products in the shopping basket, the consumers would have less ability to process information and less accurate knowledge about the products. Since the Call Center associates provide higher accuracy, the consumers would be more likely to choose the Call Center when there are unfamiliar products in the shopping basket.
Situational Factors

SFh,c,t = g1c Janh,t + g2c Julyh,t + g3c Timeh,t        (5)
where Janh,t is a seasonality dummy variable that is defined to be one if the purchase date is in January, and Julyh,t is another seasonality dummy variable that is one if the purchase date is in July. Timeh,t captures the time trend in the data. For instance, since the shopping data cover three years (36 months), Time = 1 if the order takes place in the first month of the data and Time = 36 if the order takes place in the last month of the data.
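As an illustration of how the explanatory variables in Eqs. (3)–(5) might be assembled for one observation and one channel, consider the sketch below; the coefficient values and the order details are invented, and the names simply mirror the notation in the text.

```python
from datetime import date

def shopping_cost(order_date, basket_size, value, first_share, coef, month_index):
    """Total perceived shopping cost for one channel (Eqs. 2-5), given its coefficients."""
    jan = 1.0 if order_date.month == 1 else 0.0
    july = 1.0 if order_date.month == 7 else 0.0
    return (coef["access"]
            + coef["basket"] * basket_size
            + coef["value"] * value
            + coef["first"] * first_share
            + coef["jan"] * jan
            + coef["july"] * july
            + coef["time"] * month_index)

# Invented coefficients for a single channel and a single order
coef = {"access": 3.0, "basket": 0.5, "value": 0.2, "first": 1.0,
        "jan": 0.8, "july": 0.1, "time": -0.05}
print(shopping_cost(date(2004, 1, 15), basket_size=3, value=120.0,
                    first_share=0.33, coef=coef, month_index=25))
```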
EMPIRICAL APPLICATION

The multi-channel choice model is applied to the database of the direct marketing retailer. Table 1 shows that the proposed model performs better than the alternative model. The alternative model contains the perceived channel access cost only. Our proposed model obtains a holdout sample hit rate of 56.37% vs. 32.3% for the alternative model. This performance test result implies that it is important to recognize the perceived shopping basket–related costs in the channel share prediction.
Estimation Results

Beginning with the perceived channel access cost, the estimated perceived channel access costs are positive and statistically significant for the Mail Order, Call Center, and VRS order channels. The cost for the Internet channel is not statistically significant, which means that the perceived channel access cost for the Internet channel is virtually zero relative to the other channels. The estimated perceived cost of accessing the Call Center is as high as 27.810, as opposed to the non-significant access cost of the Internet channel. This result shows that the cost of dialing up, potential wait times, and the time spent in providing account information and going through a verification process in the Call Center are perceived to be very high. In addition to the channel access cost that is fixed for each channel, the results for the perceived basket-related costs tell a more interesting story. The Basket Size coefficients for the Internet and Mail Order channels are positive and significant, but those for the other channels are not significant. When there are more products in the shopping basket, the consumers are more likely to use the Internet channel than the Call Center. The Value coefficients for the Internet and VRS are negative and significant, but those for the Call Center channel are positive and significant. When the value of
Table 1.  Model Performance.

           Proposed Model (Perceived Channel Access    Alternative Model (Perceived
           and Shopping Basket–Related Costs)          Channel Access Cost Only)
Hit rate   56.37%                                      32.3%

Notes: Prediction is computed with a holdout sample. The hit rates are statistically different (p < .05).
the basket is higher, the consumers are more likely to use the Call Center than the Internet. Finally, the First coefficient for the Internet and the Call Center are both positive and significant. Since the coefficients for both channels are positive, it is likely that the consumers use both channels for information gathering when the purchase is a new experience for them. They may gather product information from the Internet channel and also ask the Call Center associates for questions about the product. However, the magnitude of coefficient for the Call Center is much larger. The consumers are more likely to choose the Call Center than the Internet channel when the products are new to them. The situational factors are included in the model for control for seasonality and time trend, but there are still two noticeable results. First, the consumers are more likely to use the Call Center channel in January purchases. It is when many consumers return their products and look for post-holiday deals and they probably need more information from the Call Center associates. Second, there is a clear time trend that the Internet usage increases over time when the usage of other channels decreases (Table 2).
Managerial Implications The industry trend shows that the Internet channel share has increased, but the traditional channels have also maintained their position. Once the Internet channel is set up and running, the cost of maintaining the website would be relatively cheap but the cost of maintaining the Call Center would be costly and challenging for the direct marketing retailer. Hence, the direct marketing retailers would want to allocate their resources efficiently to the channels and to provide a multi-channel shopping environment for their customers. The prediction results of the proposed model show that it is important to include different relative benefits of using channels in the channel choice model. When there are many products to order in the shopping basket, the consumers are more likely to use the Internet channel, but in an extended problem-solving situation, the consumers are more likely to use the Call Center. When the value of the shopping basket is high and there are products ordering for the first time, they are more likely to use the Call Center than the Internet channel. The proposed model would enable the direct marketing retailers to predict the relative channel share for its efficient allocation of its resources.
Table 2.  Estimation Results.

                                     Estimate   Standard Error
Channel access cost   Internet      2.842      2.2340
                      Mail          1.799      0.4943
                      Call Center   27.810     4.6580
                      VRS           3.200      0.5088
Basket-related costs
  Basket size         Internet      5.947      1.3760
                      Mail          0.486      0.1575
                      Call Center   1.020      0.5450
                      VRS           3.536      1.0110
  Value               Internet      0.517      0.1542
                      Mail          0.091      0.0944
                      Call Center   0.931      0.0745
                      VRS           1.140      0.4742
  First               Internet      4.447      1.3780
                      Mail          0.942      0.6740
                      Call Center   10.470     3.4080
                      VRS           5.547      3.7200
Situational factors
  Jan                 Internet      1.561      2.6210
                      Mail          0.235      0.6822
                      Call Center   12.880     4.8700
                      VRS           0.901      3.4050
  July                Internet      0.125      2.4840
                      Mail          0.071      0.4354
                      Call Center   1.894      3.6300
                      VRS           0.381      3.1420
  Time                Internet      0.369      0.1014
                      Mail          0.060      0.0159
                      Call Center   0.420      0.1100
                      VRS           0.479      0.1097

The estimates are statistically significant at α = .05.
CONCLUSION

The current study proposes a multi-channel choice model that investigates the relative benefits of using different channels and demonstrates that such a model structure produces a better prediction of the channel share than a model without it. The empirical results also show the relative benefits of using the Internet channel vs. the traditional channels.
For future research, it would be managerially relevant to study channel choice behavior for multiple product categories. For different product categories in the shopping basket, the effects of product value and of familiarity with the products might differ. Also, decomposing the purchase process into information gathering and ordering stages would provide better prediction and further insights into channel choice behavior.
REFERENCES

Alba, J. W., & Hutchinson, J. W. (1987). Dimensions of consumer expertise. Journal of Consumer Research, 13(March), 411–454. Ansari, A., Mela, C. F., & Neslin, S. A. (2008). Customer channel migration. Journal of Marketing Research, 45(1), 60–76. Balasubramanian, S., Raghunathan, R., & Mahajan, V. (2005). Consumers in a multi-channel environment: Product utility, process utility and channel choice. Journal of Interactive Marketing, 19(2), 12–30. Biyalogorsky, E., & Naik, P. (2003). Clicks and mortar: The effect of online activities on offline sales. Marketing Letters, 14(1), 21–32. David Sheppard Associates, Inc. (1999). The new direct marketing: How to implement a profit-driven database marketing strategy. Boston: McGraw-Hill. Deleersnyder, B., Geyskens, I., Gielens, K., & Dekimpe, M. G. (2002). How cannibalistic is the internet channel? A study of the newspaper industry in the United Kingdom and the Netherlands. International Journal of Research in Marketing, 19(4), 337–348. Kumar, V., & Venkatesan, R. (2005). Who are the multichannel shoppers and how do they perform? Correlates of multichannel shopping behavior. Journal of Interactive Marketing, 19(2), 44–62. Payne, J., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. New York: Cambridge University Press. Shankar, V., Smith, A., & Rangaswamy, A. (2003). The relationship between customer satisfaction and loyalty in online and offline environments. International Journal of Research in Marketing, 20(2), 153–175. Verhoef, P. C., & Donkers, B. (2005). The effect of acquisition channels on customer retention and cross-buying. Journal of Interactive Marketing, 19(2), 31–43.
PREDICTING A NEW BRAND'S LIFE CYCLE TRAJECTORY

Frenck Waage

ABSTRACT

A company is developing a new product and wants an accurate estimate of the investment's ROI. For that, the money inflows and outflows for the project have to be forecasted, and to develop those forecasts, the resulting product's life cycle must first be forecasted. In this chapter, we consider a real company. The company was in the process of developing a new product, a special purpose computer. In June of a year, the company wished to predict the product's future life cycle before the product had been fully developed. The product would be introduced into the market in January of the following year. However, predicting the locus of a product's future life cycle before the product has been fully developed is known to be very difficult. This chapter presents a method for predicting a new brand's life cycle trajectory from its beginning to its end, before the brand is introduced into the markets. The chapter also presents a combination of two methods that use current information to revise the entire predicted trajectory so that it comes closer and closer to the true life cycle trajectory. The true trajectory is not known until the product is pulled from the market. The two methods are the Delphi method and the Kalman filter tracking method. The company with which this application originates and the problem we discuss are real. However, we are prepared to identify neither the company nor the product. This chapter discusses the approach, but the
data and the time scale have been masked, so that no identification from the data is possible.
LITERATURE

The Delphi method is guided decision making by a group of experts, originally developed by the RAND Corporation in 1959. The Delphi method is capable of delivering very good answers to problems that cannot be modeled analytically and that do not have "optimal" solutions. In this chapter, the Delphi method is used to solve one such difficult forecasting problem, namely that of forecasting the trajectory of a life cycle curve. Early studies of the method are found in Gordon and Hayward (1968) and Helmer (1977). Authors have commented on how to evaluate whether or not the Delphi technique is appropriate for various classes of problems (Adler & Ziglio, 1996). The method has been thoroughly surveyed and evaluated (Rowe & Wright, 1999), and it has been strongly supported (Wissema, 1982; Helmer, 1981). This chapter forecasts the life cycle trajectory of a new product before the product is introduced into the market. We use the Kalman filter technique to improve the initial Delphi forecast of the trajectory; over time, as new information becomes available, we use the Kalman filter to improve our estimates of the locus of the life cycle forecast. Filtering techniques are not new and belong to the field of stochastic control theory (Antoniou, 1993; Aoki, 1967; Astrom, 1970). The Kalman filter was introduced by Kalman (1960), and that paper has been followed by very many papers and books (among them Brown & Hwang, 1992; Chui & Chen, 1987; Cipra, 1993; Mehra, 1979; Proakis & Manolakis, 1992; Sorensen, 1970). More are listed in the bibliography.
THE DELPHI EXPERIMENT

Selection of the Delphi Team

A new special purpose computer was being developed. The development process was guided by concurrent decision-making methods and quality function deployment techniques. Management wanted to know the new product's future life cycle before the product had been introduced into the markets; the planned introduction date was January 2.
The company's leadership committed the resources needed to predict the new product's life cycle. The approach to be used was the Delphi method. The Delphi team was formed, consisting of nine individuals: the vice president of development (project leader), two lead scientists, two lead engineers, and four knowledgeable individuals, one from each of finance, marketing, human relations, and an economic forecasting analysis group. In addition, these nine individuals could commandeer additional resources if and when needed.
Preparing the Delphi Team

The Delphi method is a decision-making tool used by a group of experts. The method is applied to solve problems that do not possess an optimal analytical solution. For such problems, the experts on a Delphi team are asked to deploy their intuition and their judgments in a guided process that usually arrives at an acceptable decision. Experience has taught us that the method is capable of yielding satisfactory to very good decisions. To start the Delphi effort off correctly, the experts selected to serve on the Delphi panel were engaged in a significant learning effort. The objective of the effort was to develop a deep insight into, and understanding of, the dynamics of product life cycle competition in the markets our new product would enter. The training included studies of how customers assign value to each competing product, how the relative value of any brand drives demand for that brand, what the characteristics of the market and of the market segment are, and how a sequence of product life cycles forms. To accomplish this objective, the panel members had to complete a sequence of seminars led by experts.
The Objective of the Delphi Experiment

The Delphi team was assembled to forecast the locus of the new product's life cycle trajectory from start to finish. Benefiting from hindsight, we can describe the problem using the results from the Delphi experiment in Fig. 1. At time zero, the dominant growth product was product B in Fig. 1. A minor competitor is the declining product A. Our new brand "C" is designed to enter this market with a value advantage. When introduced at time zero, the value advantage will drive the customers to switch their purchases from "A" and "B" to "C." "C"
Fig. 1.  Life Cycle Trajectories for Competing Products A, B, C, D.
becomes the new dominant growth product as "A" and "B" decline toward oblivion. A new brand "D" enters the market at t = 22 with a value advantage over "C." "D" will become the new growth product as "C" starts its decline toward oblivion. In general, the growth of a growth brand can only be stopped and reversed into a decline by the introduction of a new competing growth brand.
Develop Information Packet and Questionnaire for Delphi Panel Members

To start off the Delphi experiment's first round, the members of the Delphi panel were each given a packet with the following information in it:
(1) The problem.
(2) Company forecast of the total demand in the market from t = 0 to t = 40.
(3) Historical data of the trajectories of products "A" and "B" up to t = 0.
(4) The relationship between relative value scores and market shares.
(5) Estimates of the value scores of each competing product "A," "B," "C" at t = 0.
(6) An initial guess about the moment the next competitor will be introduced.
(7) The table that had to be completed with the required forecast in it.
Enter your "Round 1" forecasts in this table.

Month    A(t)    RV(t)    B(t)    RV(t)    C(t)    RV(t)    D(t)    RV(t)    TOTAL
1
2
3
Commentary and instructions: Explicitly predict the amount of time that product "C" will spend in each phase. Explicitly predict the number of systems that will be sold in each of the phases "introduction," "growth," "peak or maturity," and "decline." Explicitly explain how much of your forecasted sales will come from growth in the total market, and how much will come from customers who switch from buying "A" or "B" to buying "C" per month. Explicitly predict when the peak sales for "C" will occur, and how large sales will be at the peak. Make explicit judgments on the time when a new "killer" brand (such as "D") will enter the market, and make guesses on the impact "D" will have on "C."

Delphi Round 1

Each member received the packet, each member worked to answer all inquiries within the allotted time, and each member forwarded the completed forecast and information forms to the Delphi experiment's coordinator.
Analysis of the Results from Round 1 and Feedback

Having all nine forecasting forms before him, the coordinator plotted the nine forecasts and found no consensus. The coordinator decided that the responses should be sorted into two groups: one a "low trajectory" forecast and the other a "high trajectory" forecast, as shown in Fig. 2. Fig. 2 took the data for the "high trajectory" from Table 1. The data for the "low trajectory" are not shown.
Fig. 2.  Delphi Round 1 "High" and "Low" Life Cycle Trajectories.
Table 1.  The "High Trajectory" from Delphi Round 1.

Month  Sales of      Lagged   Driving  |  Month  Sales of      Lagged   Driving
(t)    Product "C"   Sales    Force    |  (t)    Product "C"   Sales    Force
       (Xt)          (Xt-1)   (Ut-1)   |         (Xt)          (Xt-1)   (Ut-1)
 1        111            0      0.6    |   19       2510         2373     60.0
 2        145          111      1.0    |   20       2523         2510     61.0
 3        186          145      1.6    |   21       2524         2523     60.0
 4        252          186      2.2    |   22       2508         2524     58.0
 5        410          252      3.0    |   23       2400         2508     50.0
 6        587          410      3.7    |   24       2245         2400     40.0
 7        697          587      4.7    |   25       2034         2245     31.0
 8        882          697      5.9    |   26       1798         2034     23.0
 9        923          882      8.0    |   27       1633         1798     17.0
10       1074          923     11.0    |   28       1532         1633     13.0
11       1350         1074     16.0    |   29       1418         1532     10.0
12       1477         1350     22.0    |   30       1260         1418      8.0
13       1700         1477     29.0    |   31       1151         1260      6.0
14       1839         1700     37.0    |   32        976         1151      4.0
15       1988         1839     44.0    |   33        721          976      2.0
16       2183         1988     49.0    |   34        587          721      1.0
17       2263         2183     53.0    |   35        402          587      0.2
18       2373         2263     57.0    |   36        263          402      0.0
The coordinator decided that the Delphi experiment would require one more round. The coordinator then typed into his report for the first round the two different trajectories and, for each of the two, the significant reasons the Delphi panel members had given for making their judgments. The coordinator included in the revised information packet a modified questionnaire and the forecasting table that each panel member would have to respond to in round 2. Enter your "Round 2" forecasts in this table.
Month    A(t)    RV(t)    B(t)    RV(t)    C(t)    RV(t)    D(t)    RV(t)    TOTAL
1
2
3
Delphi Round 2

Each member received his copy of the report and the modified questionnaire and was asked to complete Delphi round 2. Each member answered all inquiries within the allotted time and forwarded the completed forecast and information forms to the Delphi experiment's coordinator.
Analysis of the Results from Round 2, Feedback and an Additional Round 3

Round 2 brought the two trajectories closer. The coordinator decided, nevertheless, that a convergence on a consensus had not occurred. He therefore launched a round 3. The result from round 3 was judged to be close enough to be called a consensus. It was the average locus (center of gravity) of the nine panel members' forecasts, shown in Fig. 3.
Further Delphi Experiments

This first Delphi prediction was made at the end of June, six months before launch. Product "C" was launched on January 2. At the end of the first quarter after January, the Delphi experiment was repeated (round 2) to generate a life cycle forecast from April 1 through to the end of the cycle. The Delphi experiment was repeated at the end of the second quarter
Fig. 3.  The Final View of the Product Life Trajectory.
(round 3) and at the end of every subsequent quarter to forecast the rest of the life cycle. The result was a rolling view of the life cycle. It proved very useful to the planners.

The Quarterly Report Following Each Delphi Experiment

At the end of each quarter, and at the end of each corresponding Delphi experiment, the coordinator issued a report that recorded the most current product "C" life cycle trajectory forecast. The report also contained the mathematical state space function (1) that was fitted to the Delphi panel's consensus forecast at the end of every quarter. The notation we use in the state space equation has this meaning: while in month t-1, the Delphi experiment predicted that sales in month t will be measured by the variable $X_{t,t-1}$.

$X_{k,t} = 0.80\,X_{k,t-1} + 7.50\,U_{k,t} + 106.50$     (1)
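As a quick illustration of how the fitted state space function is used, the sketch below iterates the recursion in Eq. (1) over a driving-force series such as the one tabulated in Table 1. The starting value and function name are assumptions for illustration; because (1) is only a fit to the consensus forecast, the projected path will not reproduce every value in Table 1.

```python
# Iterate the fitted state space recursion of Eq. (1):
#   X_t = 0.80 * X_{t-1} + 7.50 * U_t + 106.50
driving_force = [0.6, 1.0, 1.6, 2.2, 3.0, 3.7, 4.7, 5.9, 8.0, 11.0, 16.0, 22.0]

def project_life_cycle(u_series, a=0.80, b=7.50, c=106.50, x0=0.0):
    """Return the sales path implied by the fitted state equation."""
    path, x = [], x0
    for u in u_series:
        x = a * x + b * u + c
        path.append(round(x, 1))
    return path

print(project_life_cycle(driving_force))   # first value is 111.0, as in Table 1
```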
THE KALMAN FILTER'S ESTIMATE OF THE TRUE LOCUS OF THE LIFE CYCLE TRAJECTORY

So far, expert judgments have been the only basis for identifying the life cycle trajectory. However, one month after the introduction, we had
observations of actual sales. Two months after the introduction, we had another month's observed sales. To learn what the observed actual sales said about where the life cycle trajectory was trending, we used the observed sales and the Kalman filter to tell us where the sales trajectory really was. We now discuss how this updating was accomplished. The Kalman filter addresses the problem of predicting at time t-1 the state $X_{t,t-1}$ that a moving object (sales) will occupy at the next time t, when the prediction is made from the state space equation:

$X_{t,t-1} = A_{t-1,t-1} X_{t-1,t-1} + B_{t-1,t-1} U_{t-1,t-1} + C_{t-1}$

At time t, however, we can improve this estimate $X_{t,t-1}$ because new information on the actual sales is captured by $Y_t$ in the next equation, which relates the two:

$Y_t = H_{t-1} X_{t,t-1}$

The Appendix explains the workings of the Kalman filter. The predicted sale for January (Delphi forecast, Table 2) was $X_{1,0} = 111$. The actual sales were $Y_1 = 51$. The Kalman filter says that the real locus of the life cycle trajectory is at $X_{1,1} = 96$. The predicted sale for February (Delphi forecast, Table 2) was $X_{2,1} = 145$. The actual sales were $Y_2 = 203$. The Kalman filter says that the real locus of the life cycle trajectory is at $X_{2,2} = 199$. The predicted sale for March (Delphi forecast, Table 2) was $X_{3,2} = 186$. The actual sales were $Y_3 = 349$. The Kalman filter says that the real locus of the life cycle trajectory is at $X_{3,3} = 291$. In this sequential application of the Kalman filter, we get, after each month has passed, a revised and better estimate of exactly where the life cycle trajectory is.

Table 2.  Sales Tracking.

Month    Prior Predicted     Observed Actual    Kalman Revised
         Sales (Xt,t-1)      Sales (Yt)         Locus of Sales (Xt,t)
1             111                 50                   93
2             145                200                  199
3             186                350                  291
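To make the month-by-month updating concrete, here is a minimal scalar Kalman filter sketch that combines the fitted state equation (1) with observed sales. The noise variances q and r, the measurement coefficient h, and the starting state are illustrative assumptions (the chapter does not report the values used), so the revised loci printed here will not exactly match Table 2.

```python
# Scalar Kalman filter tracking of the life cycle, using the fitted state
# equation (1) for the time update.  All numeric settings are assumptions.

def kalman_track(observations, driving_force, f=0.80, g=7.50, c=106.50,
                 h=1.0, q=100.0, r=400.0, x0=0.0, p0=0.0):
    """Return the filtered (revised) locus of sales, month by month."""
    x, p, revised = x0, p0, []
    for y, u in zip(observations, driving_force):
        # Time update (Eqs. A.3-A.4): project one month ahead.
        x_prior = f * x + g * u + c
        p_prior = f * p * f + q
        # Measurement update (Eqs. A.8-A.10): blend in observed sales.
        k = h * p_prior / (h * p_prior * h + r)
        x = x_prior + k * (y - h * x_prior)
        p = (1 - k * h) * p_prior * (1 - k * h) + k * r * k
        revised.append(round(x, 1))
    return revised

# Observed sales for the first three months, as in Table 2.
print(kalman_track(observations=[50, 200, 350], driving_force=[0.6, 1.0, 1.6]))
```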
Table 3.  The True Locus of the Product Life Cycle Trajectory.

Month  Sales   Lagged   Driving  |  Month  Sales   Lagged   Driving
(t)    (Xt)    Sales    Force    |  (t)    (Xt)    Sales    Force
               (Xt-1)   (Ut-1)   |                 (Xt-1)   (Ut-1)
 1       93        0      0.6    |   19     2200     2240     60.0
 2      203       93      1.0    |   20     2190     2200     61.0
 3      349      203      1.6    |   21     2000     2190     60.0
 4      600      349      2.2    |   22     1800     2000     58.0
 5     1000      600      3.0    |   23     1600     1800     50.0
 6     1200     1000      3.7    |   24     1400     1600     40.0
 7     1400     1200      4.7    |   25     1200     1400     31.0
 8     1600     1400      5.9    |   26     1000     1200     23.0
 9     1800     1600      8.0    |   27      900     1000     17.0
10     1950     1800     11.0    |   28      800      900     13.0
11     2100     1950     16.0    |   29      700      800     10.0
12     2200     2100     22.0    |   30      500      700      8.0
13     2210     2200     29.0    |   31      300      500      6.0
14     2240     2210     37.0    |   32      200      300      4.0
15     2250     2240     44.0    |   33      100      200      2.0
16     2300     2250     49.0    |   34       50      100      1.0
17     2270     2300     53.0    |   35        0       50      0.2
18     2240     2270     57.0    |   36        0        0      0.0
THE FINAL DELPHI RESULTS

The correct locus was later learned to be the one given in Table 3. That locus is plotted in Fig. 3 together with the first Delphi view. The comparison shows that the first Delphi view, developed at time t = 0, had peak sales at 2,500 systems, whereas the true maximum was 2,270 systems. The first view had the sales crest occurring in month 21, while the crest actually occurred in month 16. The paths of the growth and the declining phases differed somewhat. The value here was not the rolling view of the life cycle as it, and our estimates of it, changed over time.
RESULTS

The main result was to create a method that generates rolling views of the life cycle as it is being created and changed in the marketplace. The errors are big. Notwithstanding that, the information that the rolling views added to the decision makers' information proved that the Delphi effort was worthwhile.
REFERENCES Adler, M., & Ziglio, E. (1996). Gazing into the Oracle. Bristol, PA: Jessica Kingsley Publishers. Antoniou, A. (1993). Digital filters. New York: McGraw-Hill Company. Aoki, M. (1967). Optimization of stochastic systems – topics in discrete-time systems. New York: Academic Press. Astrom, K. J. (1970). Introduction to stochastic control theory. New York: Academic Press. Brown, R. G., & Hwang, P. Y. C. (1992). Introduction to random signals and applied Kalman filtering (2nd ed.). New York: Wiley. Chui, C. K., & Chen, G. (1987). Kalman filtering with real-time applications. Heidelberg: Springer-Verlag. Cipra, B. (1993). Engineers look to Kalman filtering for guidance. SIAM News, 26(5), 74–103. Gordon, T. J., & Hayward, H. (1968). Initial experiments with the cross-impact matrix method of forecasting. Futures, 1(2), 100–116. Helmer, O. (1977). Problems in futures research: Delphi and causal cross-impact analysis. Futures, February, pp. 17–31. Helmer, O. (1981). Reassessment of cross-impact analysis. Futures, October, pp. 389–400. Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. ASME, Trans. Journal of Basic Engineering (82), 34–45. Mehra, R. K. (1979). Kalman filters and their applications to forecasting. In: TIMS Studies in the Management Sciences (Vol. 12, pp. 75–94). Amsterdam and New York: North-Holland Publishing Co. Proakis, J. G., & Manolakis, D. G. (1992). Digital signal processing (2nd ed.). New York: Macmillan Publishing. Rowe, G., & Wright, G. (1999). The Delphi technique as a forecasting tool: Issues and analysis. International Journal of Forecasting, 15, 353–375. Sorensen, H. W. (1970). Least squares estimation: From Gauss to Kalman. IEEE Spectrum, 7(July), 63–68. Wissema, J. G. (1982). Trends in technology forecasting. R&D Management, 12(1), 27–36.
APPENDIX A. THE KALMAN FILTER

Forecast $X_{t,t-1}$ at time $t-1$:

$X_{t,t-1} = F_t X_{t-1,t-1} + G_t U_{t-1,t-1} + W_{t-1}$     (A.1)

$Q_{t-1} = E[(W_{t-1} - 0)^2], \qquad E[W_{t-1}] = 0$     (A.2)

$E[X_{t,t-1}] = X_{t,t-1} = F_t X_{t-1,t-1} + G_t U_{t-1,t-1}$     (A.3)

$\mathrm{VAR}[X_{t,t-1}] = P_{t,t-1} = E[(X_t - X_{t,t-1})^2] = F_t P_{t-1,t-1} F_t^T + Q_{t-1}$     (A.4)

For example, if $t = 1$,

$E[X_{1,0}] = X_{1,0} = F_t X_{0,0} + G_t U_{0,0}$

At the end of period $t$, $Y_t$ becomes known, which is used to revise the sales estimate $X_{t,t-1}$:

$Y_t = H_t X_{t,t-1} + V_t$     (A.5)

from which we calculate the error

$V_t = H_t X_{t,t-1} - Y_t$     (A.6)

$R_t = E[(V_t - 0)^2]$     (A.7)

$K_t = \dfrac{H_t P_{t,t-1}}{H_t P_{t,t-1} H_t^T + R_t}$     (A.8)

Producing the revised forecasts

$X_{t,t} = X_{t,t-1} + K_t (Y_t - H_t X_{t,t-1})$     (A.9)

$P_{t,t} = E[(X_t - X_{t,t})^2] = (I - K_t H_t) P_{t,t-1} (I - K_t H_t)^T + K_t R_t K_t^T$     (A.10)

For example, if $t = 1$,

$X_{1,1} = X_{1,0} + K_1 (Y_1 - H_1 X_{1,0})$     (A.11)

$P_{1,1} = (I - K_1 H_1) P_{1,0} (I - K_1 H_1)^T + K_1 R_1 K_1^T$     (A.12)

Proof. Proof that the covariance $P_{t,t-1} = E[(X_t - X_{t,t-1})^2] = F_t P_{t-1,t-1} F_t^T + Q_{t-1}$.

First, substitute from Eqs. (A.1) and (A.2) into $E[(X_t - X_{t,t-1})^2]$ to obtain

$P_{t,t-1} = E[(F_t X_{t-1} + G_t U_{t-1} - F_t X_{t-1,t-1} - G_t U_{t-1} - W_{t-1})^2]$

Cancel the $G_t U_{t-1}$ terms:

$P_{t,t-1} = E[(F_t (X_{t-1} - X_{t-1,t-1}) - W_{t-1})^2]$

Complete the square and take expectations:

$P_{t,t-1} = E[F_t (X_{t-1} - X_{t-1,t-1})^2 F_t^T] - E[2 W_{t-1} F_t (X_{t-1} - X_{t-1,t-1})] + E[(W_{t-1})^2]$

Use the facts $E[(X_{t-1} - X_{t-1,t-1})^2] = P_{t-1,t-1}$, $E[W_{t-1}] = 0.00$, and $E[(W_{t-1})^2] = Q_{t-1}$, and get

$P_{t,t-1} = F_t P_{t-1,t-1} F_t^T + Q_{t-1}$.  QED
APPENDIX B. PROOF OF THE KALMAN FILTER

Proof. Proof that the covariance $P_{t,t} = E[(X_t - X_{t,t})^2] = P_{t,t-1}(I - K_t H_t)^2 + K_t^2 R_t$.

First, substitute from Eq. (A.8) into the error covariance $P_{t,t} = E[(X_t - X_{t,t})^2]$ to obtain

$P_{t,t} = E[(X_t - X_{t,t-1} - K_t (Y_t - H_t X_{t,t-1}))^2]$

Substituting into this result from Eq. (A.5) to replace $Y_t$ gives

$P_{t,t} = E[(X_t - X_{t,t-1} - K_t (H_t X_t + V_t - H_t X_{t,t-1}))^2]$

Collecting terms, this simplifies to

$P_{t,t} = E[((I - K_t H_t)(X_t - X_{t,t-1}) - K_t V_t)^2]$

Square, take expectations, and obtain

$P_{t,t} = E[(I - K_t H_t)^2 (X_t - X_{t,t-1})^2] + E[(K_t V_t)^2] - E[2 K_t V_t (I - K_t H_t)(X_t - X_{t,t-1})]$

Use the facts $E[(X_t - X_{t,t-1})^2] = P_{t,t-1}$, $E[V_t] = 0.00$, and $E[(V_t)^2] = R_t$, and obtain

$P_{t,t} = (I - K_t H_t) P_{t,t-1} (I - K_t H_t)^T + K_t R_t K_t^T$.  QED

Proof. Proof that the optimal value of $K_t$ is $K_t = (H_t P_{t,t-1}) / (H_t^2 P_{t,t-1} + R_t)$.

We now find the value of $K_t$ that minimizes $P_{t,t}$. To calculate it, we differentiate $P_{t,t}$ with respect to $K_t$, equate the result to zero, and solve for $K_t$. Start with Eq. (A.10):

$P_{t,t} = (I - K_t H_t) P_{t,t-1} (I - K_t H_t)^T + K_t R_t K_t^T = P_{t,t-1}(I - 2 K_t H_t + (K_t H_t)^2) + K_t R_t K_t^T$

$dP_{t,t}/dK_t = -2 H_t P_{t,t-1} + 2 K_t H_t^2 P_{t,t-1} + 2 K_t R_t = 0$

Solving for $K_t$ we get

$K_t (H_t P_{t,t-1} H_t + R_t) = P_{t,t-1} H_t$

$K_t$ approaches zero as $R_t$ approaches $\infty$. $K_t$ approaches 1.00 as $R_t$ approaches $(H_t P_{t,t-1})(I - H_t)$. $K_t$ approaches $1/H_t$ as $R_t$ approaches zero.  QED
PART III METHODS AND PRACTICES OF FORECASTING
FORECASTING PERFORMANCE MEASURES – WHAT ARE THEIR PRACTICAL MEANING?

Ronald K. Klimberg, George P. Sillup, Kevin J. Boyle and Vinay Tavva

ABSTRACT

Producing good forecasts is a vital aspect of a business. The accuracy of these forecasts can have a critical impact on the organization. We introduce a new, practical, and meaningful forecast performance measure called the percentage forecast error (PFE). The results of comparing and evaluating this new measure against traditional forecasting performance measures under several different simulation scenarios are presented in this chapter.
INTRODUCTION

Forecasting, whether it is forecasting future demand, sales, production, etc., is an integral part of almost all business enterprises. "Every time we develop a plan of any type, we first make a forecast. This is true of individuals, profit and nonprofit companies, and government organizations; in fact, it is true of any entity that makes a plan" (Mentzer & Bienstock, 1998). "Forecasts not only have direct and indirect impact on almost every activity within the
organization, but these forecasts can have an external effect, such as on stock value and perceived consumer confidence" (Lawrence et al., 2008). The accuracy of these forecasts is important (Mentzer & Bienstock, 1998; Rahmlow & Klimberg, 2002). Inaccurate forecasts are costly, from possible increased costs (e.g., inventory and personnel costs) to decreased customer satisfaction. Furthermore, these inaccuracies could endanger customers in specialized industries, such as the pharmaceutical industry, where a backordered product can jeopardize a patient's health. The more accurate the forecasts, the more confidence users have in the forecast and the forecasting process (Rahmlow & Klimberg, 2002). As markets change and competition increases, the accuracy of these sales forecasts can provide an important competitive advantage. A model's forecasts are almost never 100% accurate. A forecast may be slightly higher or slightly lower than the actual value, depending on how good the forecasting model is. The difference between a forecast value and its corresponding actual value is the forecast error: forecast error = $Y_t - F_t$, where $Y_t$ is the actual value and $F_t$ the forecasted value. The forecast error measures the accuracy of an individual forecast. In the next section, we will briefly review some of the primary forecast performance measures. This discussion will provide motivation for our new performance measure, which we call the percentage forecast error (PFE). In the subsequent section, the structure of our simulation experiments to test the performance of the traditional measures and our PFE measure will be discussed. In addition, the results of the simulations are presented. In the final section, we will discuss the conclusions of our study and future research directions.
FORECASTING PERFORMANCE MEASURES

Forecasting performance measures can be classified into two types: directional and size. The bias is the primary measure that evaluates the direction of the error and hence the degree to which a forecasting model yields forecasts that either over- or underestimate the actual values. The bias is the average of the forecast errors:

$\text{Bias} = \dfrac{\sum (Y_t - F_t)}{n}.$

The expected value of the bias is 0, and the closer to zero, the better the model. If the bias is positive, the actual values tend to be, on average,
greater than the forecast value, so that the forecasting model tends to underestimate the actual values. Conversely, if the bias is negative, the model tends to overestimate the actual values. There are several forecasting performance measures used to evaluate the size of the error. When considering the size of the error, different dimensions may be addressed: the amount of error, the dispersion of the error, or the relative magnitude of the error. Since most forecasting models tend to be unbiased, i.e., have an expected value near 0, the size of the error is of most concern to management. One of the most popular forecasting performance measures of the size of the error is the mean absolute deviation (MAD). The MAD is calculated as the average of the absolute errors:

$\text{MAD} = \dfrac{\sum |Y_t - F_t|}{n},$

where t is the time period, n the number of periods forecasted, $Y_t$ the actual value in time period t, and $F_t$ the forecast value in time period t. The MAD value measures the amount of error; the smaller the MAD value, the better. Another primary forecasting performance measure of the size of the error is the mean square error (MSE). The MSE is calculated as the average of the squared forecast errors:

$\text{MSE} = \dfrac{\sum (Y_t - F_t)^2}{n}.$

The MSE value measures the amount of dispersion of the errors. As with the MAD, the smaller the MSE value, the better. The square root of the MSE is the standard deviation of the errors, or standard error (se), and is sometimes called the root mean square error (RMSE). An assumption of most forecasting models is that the errors follow a normal distribution with a mean of zero (which would be measured by the bias) and a certain standard deviation, which is estimated by the se or RMSE. The smaller the MAD or MSE values, the more accurate the forecasting model; vice versa, the larger the values, the more inaccurate the model. A major shortcoming of the MAD and MSE forecasting measures is that they do not take into consideration the magnitude of the actual values. To illustrate, presume there is a data set with an MSE of 100 and, hence, a standard error of 10 ($s_e = 10$). If the errors then follow a normal distribution with a mean of zero, using the Empirical Rule, approximately 95% of
the values will fall within a range of ±20 = 2(10) from the mean of zero. Additionally, values would range ±20 from the estimated value for the next time period, t+1, $\hat{Y}_{t+1}$. That is, one would be able to state that one is highly confident (95%) that the forecast would be within ±20 of $\hat{Y}_{t+1}$. Next, assume $\hat{Y}_{t+1} \cong 100$. One can now say that one is 95% confident that the forecast will be between 80 and 120. Conversely, assume that $\hat{Y}_{t+1} \cong 1{,}000$. In this case, one can say one is 95% confident that the forecast will be between 980 and 1,020. Even though the se is the same, the second situation is a more accurate forecast because of the magnitude of the actual values. As a result, a major practical limitation of the MAD and MSE measures is that there is no context for knowing whether a model is "accurate" or not. A widely used evaluation of forecasting methods that does attempt to consider the effect of the magnitude of the actual values is the mean absolute percentage error (MAPE). The MAPE is calculated as the average of the absolute values of the percentage errors:

$\text{MAPE} = \dfrac{\sum |Y_t - F_t| / Y_t}{n}.$

As with the MAD and MSE performance measures, the lower the MAPE, the more accurate the forecast model. A scale for judging the accuracy of a model based on the MAPE measure was developed by Lewis (1982) and is shown in Table 1. The MAPE value, together with Lewis's scale, provides some framework for judging the model. However, depending on the data set, and on whether there is a significant trend or seasonal component, the MAPE may under- or overestimate the accuracy of the model. Except for Lewis's judgment scale (and what does it mean practically to have a "good" or "reasonable" forecast accuracy?), the conventional forecast performance measures have no real-world or business meaning or context. That is, what does it mean to have a MAD (or MSE or MAPE) of 20, except that the smaller the better?
Table 1.  A Scale of Judgment of Forecast Accuracy (Lewis, 1982).

MAPE             Judgment of Forecast Accuracy
Less than 10%    Highly accurate
11% to 20%       Good forecast
21% to 50%       Reasonable forecast
51% or more      Inaccurate forecast
This situation was the motivation for us to develop a measure we call the PFE (Klimberg & Ratick, 2000):

$\text{PFE} = \dfrac{2\,s_e}{\hat{Y}_{t+1}} \times 100\%,$

where $s_e$ is the standard error and $\hat{Y}_{t+1}$ the forecasted value for the next time period, t+1. The PFE is somewhat similar to the coefficient of variation (CV), which measures the relative dispersion around the mean. The CV is an ideal measurement for comparing the relative variation of two or more data sets, especially when they may be measured in different units. An advantage of the CV is that, regardless of the units of measure, the CV equation cancels the units out and produces a percentage. The PFE is a similar ratio, except that in its numerator the standard error is multiplied by 2. As a result, the measure spans two standard deviations away from the mean, in keeping with the Empirical Rule. Accordingly, the PFE value allows one to say, with a high level of certainty (actually 95%), that the forecast for the next time period will be within PFE% of the actual value. In other words, if the PFE is 20%, one is highly certain that the forecast will be within 20% of the actual value. The denominator is equal to the best estimate of $Y_{t+1}$. The resulting PFE value provides a conservative estimate of the accuracy of the model; the actual value of whatever is being forecast for the next period is most likely closer than what the PFE indicates.
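The following minimal sketch computes the bias, MAD, MSE, MAPE, and PFE on a short series; the actuals, forecasts, and the one-step-ahead forecast used in the PFE denominator are made-up values for illustration only.

```python
import numpy as np

# Illustrative actuals and forecasts (made-up data).
actual   = np.array([102.0,  98.0, 105.0, 110.0, 107.0, 111.0])
forecast = np.array([100.0, 101.0, 103.0, 108.0, 109.0, 110.0])
errors = actual - forecast

bias = errors.mean()                           # direction of the error
mad  = np.abs(errors).mean()                   # mean absolute deviation
mse  = (errors ** 2).mean()                    # mean square error
se   = np.sqrt(mse)                            # standard error (RMSE)
mape = (np.abs(errors) / actual).mean() * 100  # mean absolute percentage error

y_next_hat = 112.0                             # assumed forecast for period t+1
pfe = 2 * se / y_next_hat * 100                # percentage forecast error

print(f"Bias={bias:.2f}  MAD={mad:.2f}  MSE={mse:.2f}  "
      f"MAPE={mape:.1f}%  PFE={pfe:.1f}%")
```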
SIMULATIONS

Simulations were produced based on the eight distributions (listed in Table 2) and the six component variations (listed in Table 3) to test the performance of this forecasting performance measure, the PFE. We ran a total of 48 simulations, one for each combination of these distributions and component variations, each for 100 periods. Before conducting the simulations, we had several expectations. The dispersion of the data would not change but the magnitude of the values would increase; thus, the performance measure should appear more accurate. For a given distribution, a performance measure should become more accurate with the increase in values, i.e., the greater the slope, the more accurate the performance measure would become (assuming a positive slope, or increasing values). Ideally, examining a measure over the distribution variations, we would expect the measures to follow their
Table 2.  List of Distributions and Their Corresponding Coefficient of Variation (CV) Values.

Distribution Variation    Distribution                   Coefficient of Variation (CV)
A                         Uniform: 50 to 150             0.289
B                         Uniform: 950 to 1050           0.029
C                         Normal: μ = 100, σ = 1         0.01
D                         Normal: μ = 100, σ = 10        0.10
E                         Normal: μ = 100, σ = 20        0.20
F                         Normal: μ = 1,000, σ = 1       0.001
G                         Normal: μ = 1,000, σ = 10      0.01
H                         Normal: μ = 1,000, σ = 20      0.02

Table 3.  List of Component Variations.

Component Variation    Trend       Seasonal
I                      –           –
II                     Slope 1     –
III                    Slope 10    –
IV                     –           Quarterly
V                      Slope 1     Quarterly
VI                     Slope 10    Quarterly

Table 4.  Sorted by CV Values (Lowest to Highest), a List of Distribution Variations.

Distribution Variation    Distribution        CV
F                         Normal 1000-1       0.001
G                         Normal 1000-10      0.01
C                         Normal 100-1        0.01
H                         Normal 1000-20      0.02
B                         Uniform 950-1050    0.029
D                         Normal 100-10       0.10
E                         Normal 100-20       0.20
A                         Uniform 50-150      0.289
corresponding CV values, i.e., the lower the CV value, the lower the performance measure, as sorted in Table 4. Further, distribution B (uniform ranging from 950 to 1050) should be more accurate than distribution A (uniform ranging from 50 to 150); distribution F more accurate than distribution C; distribution G more accurate than distribution D; and distribution H more accurate than distribution E.
Table 5.  Expected Ranking of Component Variations (Lowest to Highest).

Group    Component Variation    Trend       Seasonal
1        I                      –           –
1        IV                     –           Quarterly
2        II                     Slope 1     –
2        V                      Slope 1     Quarterly
3        III                    Slope 10    –
3        VI                     Slope 10    Quarterly
In terms of the component variations, we expected the forecasting performance measures to be grouped and to closely follow the order shown in Table 5. The results from the simulations are summarized in Tables 6 and 7. The rankings of the average simulation values for each forecast performance measure, i.e., MAD, MSE, MAPE, and PFE, for each distribution are listed in Table 6. The higher the ranking, the lower the average and, hence, the more accurate the model; conversely, the lower the ranking, the less accurate the model. To assist in evaluating performance, the distributions (the columns in Table 6) were further sorted, lowest to highest, according to the distribution's CV, as shown in Table 4. As a result, one would expect the rankings to go from high to low when looking from left to right in the table. The MAD and MSE results, just as the MAPE and PFE results, follow similar patterns. Overall, the MAD and MSE simulated averages tend to reasonably follow the expected pattern, although the Normal 100-10 and Normal 100-20 columns (D and E) and the Normal 1000-20 and Uniform 950-1050 columns (H and B) seem to be out of order. However, the MAPE and PFE rankings do follow the expected pattern. (Note that the columns Normal 1000-10 and Normal 100-1 (G and C) could be interchanged, since they have exactly the same CV values.) In terms of the simulated averages for the component variations in Table 7, the rankings were sorted according to Table 5; the rankings should increase from the top of the table to the bottom. The component variations displayed results similar to the distribution variations; that is, the MAD and MSE reasonably follow the expected pattern while the MAPE and PFE certainly follow the expected pattern.
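For context, the sketch below generates one series of the type implied by a distribution/component combination. The exact generation scheme used in the study (e.g., the seasonal amplitude) is not reported in the chapter, so the construction here is an assumption used only for illustration.

```python
import numpy as np

def simulate_series(periods=100, dist="normal", mu=100.0, sigma=10.0,
                    low=50.0, high=150.0, slope=0.0, quarterly=False, seed=0):
    """Generate one illustrative series: base distribution plus optional trend/seasonality."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, periods + 1)
    if dist == "uniform":
        base = rng.uniform(low, high, size=periods)
    else:
        base = rng.normal(mu, sigma, size=periods)
    trend = slope * t                                                   # slope 1 or slope 10
    seasonal = 10.0 * np.sin(2 * np.pi * t / 4) if quarterly else 0.0   # assumed amplitude
    return base + trend + seasonal

# Example: distribution D (Normal mu=100, sigma=10) with component variation V
# (slope 1 plus quarterly seasonality).
series = simulate_series(mu=100.0, sigma=10.0, slope=1.0, quarterly=True)
print(series[:8].round(1))
```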
Table 6.  Ranking of Average Simulated Values for Each Distribution (Sorted According to Table 4).

                         Normal   Normal    Normal   Normal    Uniform    Normal   Normal   Uniform
                         1000-1   1000-10   100-1    1000-20   950-1050   100-10   100-20   50-150
                         (F)      (G)       (C)      (H)       (B)        (D)      (E)      (A)
MAD
  No trend                8        5         7        3         1          6        4        2
  Slope 1                 8        6         7        3         2          5        4        1
  Slope 10                8        6         7        3         1          5        4        2
  Seasonal                7        6         8        4         2          5        3        1
  Seasonal + Slope 1      7        6         8        3         1          5        4        2
  Seasonal + Slope 10     8        5         7        3         2          6        4        1
MSE
  No trend                8        5         7        3         1          6        4        2
  Slope 1                 8        6         7        4         2          5        3        1
  Slope 10                8        6         7        3         1          5        4        2
  Seasonal                7        5         8        4         2          6        3        1
  Seasonal + Slope 1      7        6         8        3         1          5        4        2
  Seasonal + Slope 10     8        5         7        3         2          6        4        1
MAPE
  No trend                8        7         6        5         4          3        2        1
  Slope 1                 8        6         7        5         4          3        2        1
  Slope 10                8        6         7        5         4          3        2        1
  Seasonal                8        6         7        5         4          3        2        1
  Seasonal + Slope 1      8        6         7        5         4          3        2        1
  Seasonal + Slope 10     8        6         7        5         3          4        2        1
PFE
  No trend                8        7         6        5         4          3        2        1
  Slope 1                 8        6         7        5         4          3        2        1
  Slope 10                8        6         7        4         3          5        2        1
  Seasonal                8        6         7        5         4          3        2        1
  Seasonal + Slope 1      8        6         7        5         4          3        2        1
  Seasonal + Slope 10     8        6         7        4         3          5        2        1
Table 7.  Ranking of Average Simulated Values for Each Component (Sorted According to Table 5).

                              Uniform   Uniform    Normal   Normal    Normal    Normal   Normal   Normal
                              50-150    950-1050   1000-1   1000-10   1000-20   100-1    100-10   100-20
MAD
  No trend (I)                 4         1          2        1         2         1        4        2
  Seasonal (IV)                3         6          5        4         6         5        3        3
  Slope 1 (II)                 1         2          4        5         4         4        2        6
  Seasonal + Slope 1 (V)       6         5          3        6         3         6        5        5
  Slope 10 (III)               5         3          1        2         1         2        1        1
  Seasonal + Slope 10 (VI)     2         4          6        3         5         3        6        4
MSE
  No trend (I)                 4         1          2        1         2         1        4        4
  Seasonal (IV)                3         6          3        3         6         5        2        2
  Slope 1 (II)                 1         5          4        5         5         4        3        3
  Seasonal + Slope 1 (V)       6         2          6        6         3         6        6        5
  Slope 10 (III)               5         3          1        2         1         2        1        1
  Seasonal + Slope 10 (VI)     2         4          5        4         4         3        5        6
MAPE
  No trend (I)                 2         1          1        1         2         1        2        2
  Seasonal (IV)                1         2          2        2         3         2        1        1
  Slope 1 (II)                 3         4          4        3         4         3        3        3
  Seasonal + Slope 1 (V)       4         3          3        4         1         4        4        4
  Slope 10 (III)               6         6          5        6         5         5        5        6
  Seasonal + Slope 10 (VI)     5         5          6        5         6         6        6        5
PFE
  No trend (I)                 2         1          1        1         1         1        2        2
  Seasonal (IV)                1         2          2        2         2         2        1        1
  Slope 1 (II)                 3         4          3        3         4         3        3        3
  Seasonal + Slope 1 (V)       4         3          4        4         3         4        4        4
  Slope 10 (III)               6         5          5        5         5         5        5        5
  Seasonal + Slope 10 (VI)     5         6          6        6         6         6        6        6
DISCUSSION

The greater accuracy of the PFE is useful in any industry because better top-line prediction leads to better bottom-line contribution. The PFE's increased accuracy is particularly important for demand forecasting in specialty industries, such as the chemical, biologic, fragrance, medical device, diagnostics, and pharmaceutical industries, because they face various technical, regulatory, and manufacturing issues that cause supply and demand uncertainties (Kiely, 2004). An example is adjustment for a supply-constrained drug for treating hemophilia A (Stonebraker & Keefer, 2009). Another example is maintaining an uninterrupted supply of product during a clinical study (Asher et al., 2007). Based on the accuracy of the simulations' findings, the PFE can also be utilized for forecasting global markets, assuming the availability of data comparable with the data used in the simulations (Choo, 2009).
CONCLUSIONS

In this chapter, we ran numerous simulations, varying the distributions and components, and evaluated the results of several major forecasting performance measures (MAD, MSE, and MAPE) and our measure, the PFE. The MAD and MSE measures reasonably followed the expected patterns, whereas the MAPE and PFE measures followed the expected patterns exactly. A major disadvantage of the traditional measures, however, is that their values have no practical basis, while the PFE does. In the future, further simulation testing similar to this study, with larger data sets and more complex patterns, should be performed. Opportune areas for further simulation testing are specialized industries in domestic and global markets, for long-term strategic planning, evaluation of replacement products (e.g., licensing opportunities), and assessment of changing market dynamics or the impact of new competitors, such as branded vs. generic drugs.
REFERENCES Asher, D., Schachter, A. D., & Ramoni, M. F. (2007). Clinical forecasting in drug development. Nature Reviews Drug Discovery, 6(Feb), 107–108. Choo, L. (2009). Forecasting practices in the pharmaceutical industry in Singapore. The Journal of Business Forecasting Methods and Systems, 19(2), 18–22. Kiely, D. (2004). The state of the pharmaceutical industry supply planning and demand forecasting. The Journal of Business Forecasting Methods & Systems, 23(3), 20.
Klimberg, R. K., & Ratick, S. (2000). A New Measure of Relative Forecast Error. INFORMS Fall 2000 Meeting, November, San Antonio. Lawrence, K., Klimberg, R., & Lawrence, S. (2008). Fundamentals of forecasting using excel (November, p. 2). New York, NY: Industrial Press. Lewis, C. D. (1982). Industrial and business forecasting methods: A radical guide to exponential smoothing and curve fitting (p. 305). London, Boston: Butterworth Scientific. Mentzer, J. T., & Bienstock, C. C. (1998). Sales forecasting management. Thousand Oaks, CA: Sage Publications. Rahmlow, H., & Klimberg, R. (2002). Forecasting practices of MBA’s. In: Advances in business and management forecasting (Vol. 3, pp. 113–123). Bingley, UK: Elsevier Science. Stonebraker, J. S., & Keefer, D. L. (2009). Modeling potential demand for supply-constrained drugs: A new hemophilia drug at Bayer biological products. Operations Research, 57(1), 19–31.
FORECASTING USING FUZZY MULTIPLE OBJECTIVE LINEAR PROGRAMMING

Kenneth D. Lawrence, Dinesh R. Pai and Sheila M. Lawrence

ABSTRACT

This chapter proposes a fuzzy approach to forecasting using a financial data set. The methodology used is multiple objective linear programming (MOLP). Selecting an individual forecast based on a single objective may not make the best use of available information for a variety of reasons. Combined forecasts may provide a better fit with respect to a single objective than any individual forecast. We incorporate soft constraints and preemptive additive weights into a mathematical programming approach to improve our forecasting accuracy. We compare the results of our approach with the preemptive MOLP approach. A financial example is used to illustrate the efficacy of the proposed forecasting methodology.
INTRODUCTION

An important problem facing decision makers in business organizations is the forecasting of uncertain events. The importance of having accurate forecasts available for decision-making is widely recognized at all levels of
decision-making and in all functional areas of business. Because of the importance of the problem, multiple forecasts are often prepared for the same event. Decision makers have to evaluate the multiple forecasts with respect to a single objective and select the forecasting method which comes closest to satisfying the chosen objective, while the remaining forecasts are discarded. Selecting a single forecast based on a single objective may not make the best use of available information for a variety of reasons. Although there is likely some overlap or redundancy of information among forecasts, discarded forecasts may contain information not available in the selected forecast. Combined forecasts may provide a better fit with regard to a single objective than an individual forecast. Even if an individual forecast does provide a good fit with respect to a single objective, a combined forecast may provide a better fit with regard to multiple objectives. The purpose of this chapter is to investigate a fuzzy approach to a combination of several techniques for forecasting monthly sales, to produce improved forecasts over those produced by more traditional single-technique approaches. This chapter compares the forecasts provided by combining the traditional forecasts with a fuzzy approach using soft constraints. The methodology used for combining the forecasts is preemptive multiple objective linear programming (MOLP).
LITERATURE REVIEW

Combining forecasts, introduced by Bates and Granger (1969), is often considered a successful alternative to using just an individual forecasting method. Empirical results demonstrate that no single forecasting method can generate the best forecasts in all situations, and the relative accuracy of the different models varies with the origin/destination pairs and the lengths of the forecasting horizons (Witt & Song, 2002). Previous studies have shown that composite forecasting is useful in predicting variables of interest such as sales, corporate earnings per share, and tourist inflows (Reeves & Lawrence, 1982; Conroy & Harris, 1987; Huang & Lee, 1989; Wong, Song, Witt, & Wua, 2007). Wong et al. (2007) show that forecast combination can improve forecasting accuracy and considerably reduce the risk of forecasting failure. They conclude that combined forecasts can be preferred to single-model forecasts in many practical situations. In a recent study, Hibon and Evgeniou (2005) have proposed a simple model-selection criterion to select among forecasts. Their results indicate that the advantage of combining
forecasts is not only better results, but also that it is less risky in practice to combine forecasts than to select an individual forecasting method. In real life, the decision maker is always seeking to achieve different conflicting objectives. Therefore, the goal programming technique has been developed to consider such types of problems (Ignizio, 1982a, 1982b). The goal programming approach is widely used in different disciplines for the purpose of decision-making. Fuzzy programming, in turn, was presented by Zimmermann (1978, 1983) in order to utilize mathematical programming approaches in a fuzzy environment. Numerous attempts have been made to develop the fuzzy goal programming (FGP) technique (Hannan, 1981, 1982; Ignizio, 1982a, 1982b). An intensive review of FGP is presented by Chen and Tsai (2001). The first application of forecasting using fuzzy set theory, to our knowledge, appeared in Economakos (1979), who applied fuzzy concepts to the computer simulation of power demand forecasting and the loading of power systems. Since this initial research, the interest in fuzzy forecasting has grown considerably. Shnaider and Kandel (1989) developed a computerized forecasting system to forecast corporate income tax revenue for the state of Florida. Song and Chissom (1993) provide a theoretical framework for fuzzy time series modeling.
THE MODELS

Multiple Objective Linear Programming

Preemptive MOLP techniques are used to generate efficient combined forecasts. The forecasting techniques utilized in the study are exponential smoothing, multiple regression, and harmonic smoothing. A basic MOLP model for combining forecasts can be formulated as follows. The decision variables in the model are defined as the weights to be assigned to each forecast:

$W_j$ = the weight assigned to forecast $j$, $j = 1, 2, \ldots, n$.

The coefficients of the model are the actual observed values and the forecasted values for each of the forecasts in each of the time periods considered:

$A_i$ = the actual observed value in time period $i$, $i = 1, 2, \ldots, m$;
$F_{ij}$ = the forecasted value by forecast $j$ in time period $i$, $j = 1, 2, \ldots, n$.
The constraints of the model take the following form:

$\sum_{j=1}^{n} F_{ij} W_j + d_i^- - d_i^+ = A_i, \qquad i = 1, 2, \ldots, m$     (1)

where $d_i^-$ is the underachievement by the combined forecast of the observed value in time period i and $d_i^+$ the overachievement by the combined forecast of the observed value in time period i, and

$\sum_{j=1}^{n} W_j = 1,$     (2)

$W_j \geq 0, \quad j = 1, 2, \ldots, n; \qquad d_i^-, d_i^+ \geq 0, \quad i = 1, 2, \ldots, m.$

In each time period, the weighted sum of the forecasted values plus or minus an error term must equal the actual observed value. The objectives of the model are expressed in terms of the underachievement and overachievement variables:

Minimize:

$Z = [Z_1(d^-, d^+), Z_2(d^-, d^+), \ldots, Z_k(d^-, d^+)],$     (3)

where

$d^- = (d_1^-, d_2^-, \ldots, d_m^-), \qquad d^+ = (d_1^+, d_2^+, \ldots, d_m^+).$

The preemptive MOLP model objectives as formulated in (3) represent alternative measures of forecast error or accuracy. Alternative accuracy objectives allow decision makers to emphasize their preferences for the form and occurrence of forecast error. Two examples of such alternative measures are (1) the minimization of total forecast error over all time periods and (2) the minimization of the maximum forecast error in any individual time period. A third measure of error, which could be stated as an objective, is the minimization of forecast error in the most recent time periods. Thus, the overall MOLP model for combining forecasts consists of minimizing (3) subject to (1)–(2). The associated
objective functions are

$Z_1 = \sum_{i=1}^{30} (d_i^- + d_i^+), \qquad Z_2 = \sum_{i=1}^{30} d_i^+, \qquad Z_3 = \sum_{i=25}^{30} (d_i^- + d_i^+).$
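As a concrete single-objective illustration of how such a model is solved, the sketch below minimizes $Z_1$ subject to constraints (1)–(2) with made-up actuals and three candidate forecasts. A preemptive treatment of $Z_1$–$Z_3$ would solve a sequence of programs of this form, fixing the attained higher-priority objective values as constraints; the data and variable names here are assumptions, not the chapter's financial data.

```python
import numpy as np
from scipy.optimize import linprog

# Made-up data: actuals A (m periods) and three candidate forecasts F (m x n).
A = np.array([100.0, 104.0, 99.0, 108.0, 112.0, 110.0])
F = np.array([[ 98.0, 103.0, 101.0],
              [105.0, 102.0, 106.0],
              [ 97.0, 100.0,  98.0],
              [107.0, 109.0, 104.0],
              [113.0, 110.0, 111.0],
              [108.0, 112.0, 109.0]])
m, n = F.shape

# Decision vector: [W (n weights), d_minus (m), d_plus (m)].
c = np.concatenate([np.zeros(n), np.ones(m), np.ones(m)])    # Z1 = total error

# Constraint (1): F W + d_minus - d_plus = A for every period i.
A_eq = np.hstack([F, np.eye(m), -np.eye(m)])
b_eq = A.copy()

# Constraint (2): the weights sum to one.
A_eq = np.vstack([A_eq, np.concatenate([np.ones(n), np.zeros(2 * m)])])
b_eq = np.append(b_eq, 1.0)

res = linprog(c, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (n + 2 * m), method="highs")
print("combined-forecast weights:", np.round(res.x[:n], 3))
```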
Fuzzy Approach Using Soft Constraints (Fuzzy I)

The FGP model enlarges the feasible region by fuzzifying the constraints of the model with given tolerance values. Here, we use a combination of soft and crisp constraints instead of only crisp constraints in the MOLP. The fuzzy approach formulation is as follows (Fang, Hu, Wang, & Wu, 1999):

Minimize:

$Z = [Z_1(d^-, d^+), Z_2(d^-, d^+), \ldots, Z_k(d^-, d^+)],$

subject to:

$\sum_{j=1}^{n} \tilde{F}_{ij} W_j + d_i^- - d_i^+ = \tilde{A}_i, \qquad i = 1, 2, \ldots, m$

and

$\sum_{j=1}^{n} W_j = 1; \quad W_j \geq 0, \; j = 1, 2, \ldots, n; \quad d_i^-, d_i^+ \geq 0, \; i = 1, 2, \ldots, m,$

where $\tilde{F}_{ij}$ and $\tilde{A}_i$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, are fuzzy coefficients in terms of fuzzy sets.
Fuzzy Approach Weighted Additive Model (Fuzzy II)

The weighted additive model is widely used in GP and multi-objective optimization techniques to reflect the relative importance of the goals/objectives. In this approach, the decision maker assigns differential weights as coefficients of the individual terms in the simple additive fuzzy achievement function to reflect their relative importance, i.e., the objective function is formulated by multiplying each membership of the fuzzy goal with a suitable weight and then adding them together. This leads to the following formulation, corresponding to (Hannan, 1981; Tiwari, Dharmar, & Rao, 1987):

Maximize:

$Z(\mu) = \sum_{i=1}^{m} w_i \mu_i,$     (4)

subject to:

$\mu_i = \dfrac{G_i(X) - L_i}{g_i - L_i}, \qquad AX \leq b, \qquad X, \mu_i \geq 0, \qquad \mu_i \leq 1, \quad i = 1, 2, \ldots, m.$
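A small numerical sketch of formulation (4) follows. The goals, aspiration levels ($g_i$), lower tolerances ($L_i$), weights, and the crisp constraint are all assumed for illustration; the chapter's own financial data are not reproduced here.

```python
import numpy as np
from scipy.optimize import linprog

# Weighted additive fuzzy model (4) for two linear goals G1(x), G2(x) to be
# pushed toward aspiration levels g.  All numbers below are assumptions.
# Decision vector: [x1, x2, mu1, mu2], with mu_i = (G_i(x) - L_i) / (g_i - L_i).
G = np.array([[3.0, 2.0],        # G1(x) = 3*x1 + 2*x2
              [1.0, 4.0]])       # G2(x) = 1*x1 + 4*x2
L = np.array([10.0,  8.0])       # lower tolerance limits
g = np.array([30.0, 24.0])       # aspiration levels
w = np.array([0.6, 0.4])         # decision maker's weights on the memberships

c = np.concatenate([np.zeros(2), -w])          # maximize w·mu -> minimize -w·mu

# Membership definition as an equality: G_i(x) - (g_i - L_i) * mu_i = L_i.
A_eq = np.hstack([G, -np.diag(g - L)])
b_eq = L

# Crisp constraint A x <= b, plus 0 <= mu_i <= 1 through variable bounds.
A_ub = np.array([[1.0, 1.0, 0.0, 0.0]])
b_ub = np.array([9.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None), (0, None), (0, 1), (0, 1)], method="highs")
x1, x2, mu1, mu2 = res.x
print(f"x = ({x1:.2f}, {x2:.2f}), memberships = ({mu1:.2f}, {mu2:.2f})")
```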
RESULTS

Table 1 shows the results of our models. Several observations can be made based on these results. First, combining different forecasting methods helps improve the level of one or more objectives without worsening the level of any other objective. Second, using weights similar to those in the MOLP, the preemptive fuzzy approach gives remarkably improved results.
CONCLUSIONS

In this chapter, we compare the forecasts provided by combining the traditional forecasts with a fuzzy approach using soft constraints and preemptive additive weights. The methodology used for combining the forecasts is preemptive MOLP. The main advantage of this approach is that it allows the decision maker to easily determine a fuzzy relative importance for each goal with respect to the other goals. The fuzzy weighted additive criterion is also presented. We found that both fuzzy approaches provide remarkable improvements over the MOLP technique.
Table 1.  Results.

Preemptive MOLP
  Z1  Total forecast error              232.8
  Z2  Positive forecast error           123.0
  Z3  Recent forecast error             414.8
Weight Values
  W1  Exponential smoothing               0.6
  W2  Harmonic smoothing                  0.0
  W3  Multiple regression                 0.3
Fuzzy I
  Z1  Total forecast error              108.2
  Z2  Positive forecast error           108.2
  Z3  Recent forecast error              44.0
Fuzzy II
  Z   Total combined forecast error       2.0
Within the fuzzy approaches, the preemptive additive weighted approach provides much lower forecast errors.
REFERENCES Bates, J. M., & Granger, C. W. J. (1969). The combination of forecasts. Operational Research Quarterly, 20, 451–468. Chen, L. H., & Tsai, F. C. (2001). Fuzzy goal programming with different importance and priorities. European Journal of Operational Research, 133, 548–556. Conroy, R., & Harris, R. (1987). Consensus forecasts of corporate earnings: Analyst’s forecasts and time series methods. Management Science, 33, 725–738. Economakos, E. (1979). Application of fuzzy concepts to power demand forecasting. IEEE Transactions on Systems, Man and Cybernetics, 9(10), 651–657. Fang, S., Hu, C., Wang, H., & Wu, S. (1999). Linear programming with fuzzy coefficients in constraints. Computers and Mathematics with Applications, 37, 63–76. Hannan, E. L. (1981). On fuzzy goal programming. Decision Sciences, 12, 522–531. Hannan, E. L. (1982). Contrasting fuzzy goal programming and fuzzy multicriteria programming. Decision Sciences, 13, 337–339. Hibon, M., & Evgeniou, T. (2005). To combine or not to combine: Selecting among forecasts and their combinations. International Journal of Forecasting, 21(1), 15–24. Huang, H., & Lee, T. (1989). To combine forecasts or to combine information? No 200806, Working Papers from University of California at Riverside, Department of Economics, 2007. Ignizio, J. P. (1982a). Linear programming in single and multiple objective systems. New Jersey: Prentice-Hall. Ignizio, J. P. (1982b). On the rediscovery of fuzzy goal programming. Decision Sciences, 13, 331–336. Reeves, G. R., & Lawrence, K. D. (1982). Combining multiple forecasts given multiple objectives. Journal of Forecasting, 1, 271–279. Shnaider, E., & Kandel, A. (1989). The use of fuzzy set theory for forecasting corporate tax revenues. Fuzzy Sets and Systems, 31(2), 187–204. Song, Q., & Chissom, B. (1993). Fuzzy time series and its models. Fuzzy Sets and Systems, 54(3), 269–277. Tiwari, R. N., Dharmar, S., & Rao, J. R. (1987). Fuzzy goal programming – An additive model. Fuzzy Sets and Systems, 24, 27–34. Witt, S. F., & Song, H. (2002). Forecasting tourism flows. In: A. Lockwood & S. Medlik (Eds), Tourism and hospitality in the 21st Century (pp. 106–118). Oxford England: Butterworth Heinemann. Wong, K., Song, H., Witt, S., & Wua, D. (2007). Tourism forecasting: To combine or not to combine? Tourism Management, 28(4), 1068–1078. Zimmermann, H. J. (1978). Fuzzy programming and linear programming with several objective functions. Fuzzy Sets and Systems, 1, 45–55. Zimmermann, H. J. (1983). Fuzzy mathematical programming. Computers and Operations Research, 10, 291–298.
A DETERMINISTIC APPROACH TO SMALL DATA SET PARTITIONING FOR NEURAL NETWORKS

Gregory E. Smith and Cliff T. Ragsdale

ABSTRACT

Several prominent data-mining studies have evaluated the performance of neural networks (NNs) against traditional statistical methods on the two-group classification problem in discriminant analysis. Although NNs often outperform traditional statistical methods, their performance can be hindered because of failings in the use of training data. This problem is particularly acute when using NNs on smaller data sets. A heuristic is presented that utilizes Mahalanobis distance measures (MDM) to deterministically partition training data so that the resulting NN models are less prone to overfitting. We show this heuristic produces classification results that are more accurate, on average, than traditional NNs and MDM.
INTRODUCTION

The classification problem in discriminant analysis involves identifying a function that accurately assigns observations to one of two or more mutually exclusive groups. This problem represents a
fundamental challenge for data miners. A number of studies have shown that neural networks (NNs) can be successfully applied to the classification problem (Archer & Wang, 1993; Glorfeld & Hardgrave, 1996; Markham & Ragsdale, 1995; Patuwo et al., 1993; Piramuthu et al., 1994; Salchenberger et al., 1992; Smith & Gupta, 2000; Tam & Kiang, 1992; Wilson & Sharda, 1994; Zobel et al., 2006). However, as researchers push to improve predictive accuracy by addressing shortcomings in the NN model building process (e.g., selection of network architectures, training algorithms, stopping rules, etc.), a fundamental issue in how data are used to build NN models has largely been ignored – data partitioning.

To create an NN model for a classification problem, we require a sample of data consisting of a set of observations of the form Yi, X1, X2, …, Xk, where Xj represents measured values on various independent variables and Yi a dependent variable coded to represent the group membership of observation i. These data are often referred to as training data, as they are used to teach the NN to distinguish between observations originating from the different groups represented by the dependent variable. Although one generally wishes to create an NN that can predict the group memberships of the training data with reasonable accuracy, the ultimate objective is for the NN to generalize, that is, to accurately predict the group memberships of new data that were not present in the training data and whose true group memberships are not known. The ability of an NN to generalize depends greatly on the adequacy and use of its training data (Burke & Ignizio, 1992).

Typically, the training data presented to an NN are randomly partitioned into two samples: one for calibrating (or adjusting the weights in) the NN model and one for periodically testing the accuracy of the NN during the calibration process. For this hold-out form of cross-validation, the testing sample is used to prevent overfitting, which occurs if an NN begins to model sample-specific characteristics of the training data that are not representative of the population from which the data were drawn. Overfitting reduces the generalizability of an NN model and, as a result, is a major concern in NN model building.

Several potential problems arise when NN training data are randomly partitioned into calibration and testing samples. First, the data randomly assigned to the calibration sample might be biased and not accurately represent the population from which the training data were drawn, potentially leading to a sample-specific NN model. Second, the data randomly assigned to the testing sample may not effectively distinguish between good and bad NN models. For example, in a two-group discriminant problem, suppose the randomly selected testing data from
each group happen to be points located tightly around each of the group centroids. In this case, a large number of classification functions are likely to be highly and equally effective at classifying the testing sample. As a result, the testing data are ineffective in preventing overfitting and inhibit (rather than enhance) the generalizability of the NN model.

Though the potential problems associated with random data partitioning can arise in any data set, their impact can be more acute with small data sets. This may have contributed to the widely held view that NNs applying hold-out cross-validation are most appropriately used with classification problems where a large amount of training data is available. However, if training data could be partitioned in such a way as to combat the shortcomings of random partitioning, the effectiveness of NNs might be enhanced, especially for smaller data sets.

This chapter proposes an NN data partitioning (NNDP) heuristic that uses Mahalanobis distance measures (MDM) to deterministically partition training data into calibration and testing samples so as to avoid the potential weaknesses of random partitioning. Computational results are presented indicating that the use of NNDP results in NN models that outperform traditional NN models and the MDM technique on small data sets.

The remainder of this chapter is organized as follows. First, the fundamental concepts of MDM and NN classification methods for solving two-group classification problems are discussed. Next, the proposed NNDP heuristic is described. Finally, the three techniques (MDM, NN, and NNDP) are applied to several two-group classification problems and the results are examined.
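Before turning to the classification methods, the following minimal Python sketch illustrates the random 50/50 stratified hold-out partition discussed above. The array names and the use of NumPy are our own illustrative assumptions, not the tooling used in this chapter.

```python
import numpy as np

def random_holdout_split(X, y, seed=0):
    """Randomly split training data into calibration and testing halves,
    keeping an equal share of each group (stratified 50/50 split)."""
    rng = np.random.default_rng(seed)
    calib_idx, test_idx = [], []
    for label in np.unique(y):
        members = np.flatnonzero(y == label)
        rng.shuffle(members)
        half = len(members) // 2
        calib_idx.extend(members[:half])   # used to adjust the NN weights
        test_idx.extend(members[half:])    # used to monitor overfitting
    return (X[calib_idx], y[calib_idx]), (X[test_idx], y[test_idx])

# Example: 20 two-variable observations, 10 per group
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
(calib_X, calib_y), (test_X, test_y) = random_holdout_split(X, y)
```

Because the assignment is random, nothing guarantees that the testing half covers the region where the two groups overlap, which is precisely the weakness this chapter addresses.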
CLASSIFICATION METHODS

Mahalanobis Distance Measures

The aim of a two-group classification problem is to generate a rule for classifying observations of unknown origin into one of two mutually exclusive groups. The formulation of such a rule requires a "training sample" consisting of n observations, where n1 are known to belong to group 1, n2 are known to belong to group 2, and n1 + n2 = n. This training sample is analyzed to determine a classification rule applicable to new observations whose true group memberships are not known. A very general yet effective statistical procedure for developing classification rules is the MDM technique. This technique attempts to classify a
new observation of unknown origin into the group to which it is closest, based on multivariate distance measures from the observation to the estimated mean vector of each of the two groups. To be specific, suppose that each observation $X_i$ is described by its values on k independent variables, $X_i = (X_{1i}, X_{2i}, \ldots, X_{ki})$. Let $\bar{x}_{kj}$ represent the sample mean of the kth independent variable in group j. Each of the two groups has its own centroid, denoted by $\bar{X}_j = (\bar{x}_{1j}, \bar{x}_{2j}, \ldots, \bar{x}_{kj})$, $j \in \{1, 2\}$. The MDM of a new observation $X_{new}$ of unknown origin to the centroid of group j is given by:

$$D_j = (X_{new} - \bar{X}_j)\, C^{-1} (X_{new} - \bar{X}_j)' \qquad (1)$$

where C represents the pooled covariance matrix for both groups (Manly, 2004). So, to classify a new observation, the MDM approach first calculates the multivariate distance from the observation to the centroid of each of the two groups using Eq. (1). This results in two distance measures: D1 for group 1 and D2 for group 2. A new observation is classified as belonging to the group with the minimum Dj value.

Under certain conditions (i.e., multivariate normality of the independent variables in each group and equal covariance matrices across groups) the MDM approach provides "optimal" classification results in that it minimizes the probability of misclassification. Even when these conditions are violated, the MDM approach can still be used as a heuristic (although other techniques might be more appropriate). In any event, the simplicity, generality, and intuitiveness of the MDM approach make it a very appealing technique to use on classification problems (Markham & Ragsdale, 1995).
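To make Eq. (1) concrete, the following is a minimal Python sketch of the MDM rule. The pooled-covariance estimate and the variable names are our own illustrative choices; the chapter itself used a Microsoft Excel add-in for these computations.

```python
import numpy as np

def mdm_classify(x_new, X1, X2):
    """Classify x_new into group 1 or 2 by the minimum Mahalanobis distance
    to the group centroids, using a pooled covariance matrix (Eq. (1))."""
    n1, n2 = len(X1), len(X2)
    c1, c2 = X1.mean(axis=0), X2.mean(axis=0)            # group centroids
    S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
    C = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)  # pooled covariance
    C_inv = np.linalg.inv(C)
    d = [float((x_new - c) @ C_inv @ (x_new - c)) for c in (c1, c2)]
    return (1 if d[0] <= d[1] else 2), d

# Example with two synthetic, well-separated groups
rng = np.random.default_rng(1)
X1 = rng.normal(0.0, 1.0, size=(30, 3))
X2 = rng.normal(2.5, 1.0, size=(30, 3))
group, distances = mdm_classify(np.array([0.2, -0.1, 0.4]), X1, X2)
```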
Neural Networks

Another way of solving two-group classification problems is through the application of NNs. NNs are function approximation tools that learn the relationship between independent and dependent variables. However, unlike most statistical techniques for the classification problem, NNs are inherently nonparametric and make no distributional assumptions about the data presented for learning (Smith & Gupta, 2000). An NN is composed of a number of layers of nodes linked together by weighted connections. The nodes serve as computational units that receive inputs and process them into outputs. The connections determine the information flow between nodes and can be unidirectional, where
information flows only forwards or only backwards, or bidirectional, where information can flow forwards and backwards (Fausett, 1994). Fig. 1 depicts a multilayered feed-forward NN (MFNN) in which weighted arcs are directed from nodes in an input layer to those in an intermediate or hidden layer, and then to an output layer.

Fig. 1. Multi-Layered Feed-Forward Neural Network for Two-Group Classification (input units T1, …, Tn; hidden units Z1, …, Zp; output units Y1, Y2).

The back-propagation (BP) algorithm is a widely accepted method used to train an MFNN (Archer & Wang, 1993). When training an NN with the BP algorithm, each input node T1, …, Tn receives an input value from an independent variable associated with a calibration sample observation and broadcasts this signal to each of the hidden layer nodes Z1, …, Zp. Each hidden node then computes its activation (a functional response to the inputs) and sends its signal to each output node Yk. Each output unit computes its activation to produce the network's response for the observation in question. The BP algorithm uses supervised learning, meaning that examples of input (independent) and output (dependent) values of known origin for each of the two groups are provided to the NN. In this study, the known output value for each example is provided as a two-element binary vector in which a value of zero indicates the correct group
membership. Errors are calculated as the difference between the known output and the NN response. These errors are propagated back through the network and drive the process of updating the weights between the layers to improve predictive accuracy. In simple terms, NNs "learn" as the weights are adjusted in this manner. Training begins with random weights that are adjusted iteratively as calibration observations are presented to the NN. Training continues with the objective of error minimization until stopping criteria are satisfied (Burke, 1991).

To keep an NN from overfitting the calibration data, testing data are periodically presented to the network to assess the generalizability of the model under construction. The Concurrent Descent Method (CDM) (Hoptroff et al., 1991) is widely used to determine the number of times the calibration data should be presented to achieve the best performance in terms of generalization. Using the CDM, the NN is trained for an arbitrarily large number of replications, with pauses at predetermined intervals. During each pause, the NN weights are saved and tested for predictive accuracy using the testing data. The average deviation of the predicted group from the known group for each observation in the testing sample is then calculated and replications continue (Markham & Ragsdale, 1995). The calibration process stops when the average deviation on the testing data worsens (i.e., increases). The NN model with the best performance on the testing data is then selected for classification purposes (Klimasauskas et al., 1989).

Once a final NN is selected, new input observations may be presented to the network for classification. In the two-group case, the NN will produce two response values, one for each of the two groups, for each new observation presented. As with the MDM classification technique, these responses can be interpreted as measures of group membership when compared to the known two-group output vector: the smaller (closer to zero) the value associated with a particular group, the greater the likelihood of the observation belonging to that group. Thus, the new observation is classified into the group corresponding to the NN output node producing the smallest response.

Since NNs are capable of approximating any measurable function to any degree of accuracy, they should be able to perform at least as well as the linear MDM technique on non-normal data (Hornick et al., 1989). However, several potential weaknesses of an NN model may arise when the data presented for model building are randomly partitioned into testing and calibration samples. First, the randomly assigned calibration data may not be a good representation of the population from which they were drawn,
potentially leading to a sample-specific model. Second, the testing data may not accurately assess the generalization ability of a model if not chosen wisely. These weaknesses, individually or together, may adversely affect predictive accuracy and lead to a nongeneralizable NN model. In both cases, the weaknesses arise because of problems with data partitioning and not from the model building process.
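As an illustration of the CDM-style hold-out procedure described above, the following is a minimal Python sketch that trains in intervals, checks the testing error at each pause, and keeps the best weights. It uses scikit-learn's MLPClassifier purely for convenience; the chapter's experiments used NeuralWorks Predict, and the interval length, network size, and exact stopping rule shown here are simplifying assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def cdm_train(calib_X, calib_y, test_X, test_y,
              max_intervals=50, passes_per_interval=20):
    """Train in intervals; at each pause, score the testing sample and retain
    the weights with the lowest testing error. Stop once that error worsens."""
    net = MLPClassifier(hidden_layer_sizes=(5,), solver="sgd",
                        learning_rate_init=0.05, random_state=0)
    net.partial_fit(calib_X, calib_y, classes=np.unique(calib_y))  # initialize weights
    best_err, best_weights = np.inf, None
    for _ in range(max_intervals):
        for _ in range(passes_per_interval):
            net.partial_fit(calib_X, calib_y)          # one pass over calibration data
        err = np.mean(net.predict(test_X) != test_y)    # deviation on the testing data
        if err < best_err:
            best_err = err
            best_weights = ([w.copy() for w in net.coefs_],
                            [b.copy() for b in net.intercepts_])
        elif err > best_err:
            break                                       # testing error worsened: stop
    if best_weights is not None:
        net.coefs_, net.intercepts_ = best_weights
    return net, best_err
```

The network returned by this routine is the one whose weights performed best on the testing sample, mirroring the weight-selection step described in the text.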
Deterministic Neural Network Data Partitioning

As stated earlier, randomly partitioning training data can adversely affect the generalizability and accuracy of NN results. Thus, we investigate the effect of applying a deterministic pre-processing technique to the training data to improve results and combat the potential shortcomings of random data selection. The intention of this effort is to deterministically select testing and calibration samples for an NN so as to limit overfitting and improve classification accuracy in two-group classification problems on small data sets.

We introduce an NNDP heuristic that utilizes MDM as the basis for selecting testing and calibration data. In the NNDP heuristic, MDM is used to calculate distance measures from each observation presented for training to both group centroids. These two distance values represent: (1) the distance from each observation to its own group centroid and (2) the distance from each observation to the opposite group centroid. A predetermined number of observations having the smallest distance measures to the opposite group centroid are selected as the testing sample. These observations are those most apt to fall in the region where the groups overlap. Observations in the overlap region are the most difficult to classify correctly; hence, this area is precisely where the network's classification performance is most critical. Selecting testing data in this manner avoids the undesirable situation where no testing data fall in the overlap region, which might occur with random data partitioning (e.g., if the randomly selected testing data happen to fall tightly around the group centroids). The training observations not assigned to the testing sample constitute the calibration sample. They have the largest distance measures to the opposite group's centroid and therefore are most dissimilar to the opposite group and most representative of their own group. We conjecture that the NNDP heuristic will decrease overfitting and increase predictive accuracy for two-group classification problems in discriminant analysis. A sketch of this selection step follows.
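The following minimal Python sketch implements the selection step just described for the 50/50 split used later in the experiments. The pooled-covariance Mahalanobis distance and the function names mirror the earlier MDM sketch and are illustrative assumptions rather than the authors' production code.

```python
import numpy as np

def nndp_partition(X1, X2):
    """Deterministically split each group: the half closest (in Mahalanobis
    distance) to the OPPOSITE group's centroid becomes the testing sample;
    the remaining observations form the calibration sample."""
    n1, n2 = len(X1), len(X2)
    c1, c2 = X1.mean(axis=0), X2.mean(axis=0)
    C = ((n1 - 1) * np.cov(X1, rowvar=False) +
         (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    C_inv = np.linalg.inv(C)

    def split(X, opposite_centroid):
        diffs = X - opposite_centroid
        d = np.einsum("ij,jk,ik->i", diffs, C_inv, diffs)  # distance to opposite centroid
        order = np.argsort(d)                              # closest to the opposite group first
        half = len(X) // 2
        return X[order[:half]], X[order[half:]]            # (testing, calibration)

    test1, calib1 = split(X1, c2)
    test2, calib2 = split(X2, c1)
    return (test1, test2), (calib1, calib2)
```

Observations in the testing halves are those most likely to lie in the overlap region, so the stopping rule in the calibration process is judged exactly where classification is hardest.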
METHODOLOGY

From the previous discussion, three different methods for solving the two-group problem were identified for computational testing: MDM – standard statistical classification using Mahalanobis distance measures; NN – neural network classification using random selection of testing and calibration data; and NNDP – neural network classification using the NNDP heuristic to deterministically select testing and calibration data.

The predictive accuracy of each technique is assessed using two bank failure prediction data sets, which are summarized in Table 1. The data sets were selected because they offer interesting contrasts in the number of observations and the number of independent variables. For experimental testing purposes, each data set is randomly divided into two samples, one for training and one for validation of the model (Fig. 2). The training data are used with each of the three solution methodologies for model building. Although the NN techniques partition the training data into two samples (calibration and testing) for model building, the MDM technique uses all the training data with no intermediate model testing.
Table 1. Summary of Bankruptcy Data Sets.

                          Texas Bank(a)    Moody's Industrial(b)
Number of observations    162              46
  Bankrupt firms          81               21
  Nonbankrupt firms       81               25
Number of variables       19               4

(a) Tam and Kiang (1992). (b) Johnson and Wichern (2007).
Fig. 2. Experimental Use of Sample Data (the Training Data are split into Calibration and Testing Samples; the Hold-Out Sample serves as Validation Data).
The validation data represent "new" observations to be presented to each of the three modeling techniques for classification, allowing the predictive accuracy of the various techniques to be assessed on observations that had no role in developing the respective classification functions. Thus, the validation data provide a good test of how well the classification techniques might perform when used on observations encountered in the future whose true group memberships are unknown.

To assess the effect of training sample size on the various classification techniques, three different training and validation sample sizes were used for each data set. Table 2 reports the randomly split sample sizes in the study by data set, trial, and sample. The data assigned to training are balanced, with an equal number of successes and failures. All observations assigned to the training sample are used in the MDM technique for model building. The NN technique uses the same training sample as the MDM technique, but randomly splits the training data into testing and calibration samples in a 50/50 split, with an equal assignment of successes and failures to each sample. The NNDP technique also uses the same training sample as the MDM and NN techniques, but uses our deterministic selection heuristic to choose the testing and calibration samples: it selects the half of each of the two groups that is closest to the opposite group centroid as the testing data, and the remaining observations are assigned as calibration data. Again, there is a 50/50 assignment of successes and failures to both the testing and calibration samples.

For each bankruptcy data set, we generated 30 different models for each of the three training/validation sample size scenarios. This results in 90 runs for each of the three solution methodologies for each data set. For each run, MDM results were generated first, followed by the NN results and the NNDP results. A Microsoft EXCEL add-in was used to generate the MDM classification results as well as the distance measures used for data pre-processing for the NNDP heuristic.
Table 2. Summary of Data Assignments.

                     Texas Bank                      Moody's Industrial
                     Trial 1   Trial 2   Trial 3     Trial 1   Trial 2   Trial 3
Training sample      52        68        80          16        24        32
Validation sample    110       94        82          30        22        14
Total                162       162       162         46        46        46
The NNs used in this study were developed using NeuralWorks Predict (Klimasauskas et al., 1989). The standard back-propagation configuration was used. The NNs used sigmoidal functions for activation at nodes in the hidden and output layers, and all default settings for Predict were followed throughout.
RESULTS

Texas Bank Data

Table 3 lists the average percentage of misclassified observations in the validation sample for each training/validation split of the Texas Bank Data. Note that the average rate of misclassification for NNDP was 16.58%, as compared to the NN and MDM techniques, which misclassified at 20.21% and 19.09%, respectively, using 52 observations for training (and 110 in validation). Likewise, the average rates of misclassification for the NNDP, NN, and MDM techniques were 16.03%, 18.83%, and 17.30%, respectively, using 68 observations for training (and 94 in validation). Finally, we found the average rates of misclassification for NNDP, NN, and MDM to be 14.92%, 18.09%, and 16.34%, respectively, using 80 observations for training (and 82 in validation). It should be noted that, on average, the NNDP technique was more accurate than the other two techniques at all experimental levels. Also, the average misclassification rate for each technique decreased as the number of observations assigned to model building increased (or validation sample size decreased), as we would expect with increased training sample size.
Table 3. Average Percentage of Misclassification by Solution Methodology.

Validation size    MDM (%)     NN (%)    NNDP (%)
110                19.09(a)    20.21     16.58(a,b)
94                 17.30(a)    18.83     16.03(a,b)
82                 16.34(a)    18.09     14.92(a,b)
Total              17.58       19.04     15.84

(a) Indicates statistically significant differences from NN at the α = .005 level.
(b) Indicates statistically significant differences from MDM at the α = .005 level.
Table 4 lists the number of times each technique produced the fewest misclassifications in each of the 30 runs at each validation size for the Texas Bank Data. Although NNDP did not always produce the fewest misclassifications, it "won" significantly more times than the other two methods. Several cases exist in which the MDM and/or NN outperformed the NNDP; these results are to be expected, as NNs are heuristic search techniques that may not always provide globally optimal solutions.

Table 4. Number of Times Each Methodology Produced the Fewest Misclassifications.

Validation size    MDM    NN    NNDP
110                8      5     20
94                 12     9     15
82                 7      9     18
Total              27     23    53

Note: In the event of a tie, each tied technique was given credit for having the fewest misclassifications. Therefore, the total number for each validation size may be greater than 30.

Moody's Industrial Data

Table 5 lists the average percentage of misclassified observations in the validation sample for each training/validation split of the Moody's Industrial Data. We see that the average rate of misclassification for NNDP was 16.00%, as compared to the NN and MDM techniques, which both misclassified at 19.11%, using 16 observations for training (and 30 in validation).
Table 5. Average Percentage of Misclassification by Solution Methodology.

Validation size    MDM (%)     NN (%)    NNDP (%)
30                 19.11       19.11     16.00(a,b)
22                 20.00(a)    19.09     17.12(a,b)
14                 17.14(a)    18.81     13.31(a,b)
Total              18.75       19.00     15.48

(a) Indicates statistically significant differences from NN at the α = .005 level.
(b) Indicates statistically significant differences from MDM at the α = .005 level.
Likewise, the average rates of misclassification for the NNDP, NN, and MDM techniques were 17.12%, 19.09%, and 20.00%, respectively, using 24 observations for training (and 22 in validation). Finally, we found the average rates of misclassification for NNDP, NN, and MDM to be 13.31%, 18.81%, and 17.14%, respectively, using 32 observations for training (and 14 in validation). Again, it should be noted that, on average, the NNDP technique was more accurate than the other two techniques at all experimental levels. Also, the average misclassification rate for each technique decreased as the number of observations assigned to model building increased (or validation sample size decreased), as we would expect with increased training sample size.

Table 6 lists the number of times each technique produced the fewest misclassifications in each of the 30 runs at each training sample size for the Moody's Industrial Data. Again we observe that the NNDP did not always produce the fewest misclassifications; however, it "won" significantly more times than the other two methods.

Table 6. Number of Times Each Methodology Produced the Fewest Misclassifications.

Validation size    MDM    NN    NNDP
30                 13     9     24
22                 12     12    17
14                 15     10    20
Total              40     31    61

Note: In the event of a tie, each tied technique was given credit for having the fewest misclassifications. Therefore, the total number for each validation size may be greater than 30.

The results from both data sets show that the NNDP heuristic outperformed, on average, the MDM and NN techniques in all cases. In addition, the NNDP reduced misclassification when compared to MDM (the more accurate of the two traditional techniques) by an average of 9.90% on the Texas Bank data and 17.44% on the Moody's data.
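For clarity, these relative reductions follow directly from the "Total" rows of Tables 3 and 5:

$$\frac{17.58 - 15.84}{17.58} \approx 9.90\%, \qquad \frac{18.75 - 15.48}{18.75} \approx 17.44\%$$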
IMPLICATIONS AND CONCLUSIONS

Implications

Several important implications for data mining arise from this research. First, the proposed NNDP heuristic holds considerable promise in
eliminating the negative effects that random data partitioning can have on building a generalizable NN with hold-out cross-validation. Although further testing is necessary, it appears that on small two-group data sets the NNDP technique will perform at least as well as traditional statistical techniques and standard NNs that use a random assignment of calibration and testing data. This is especially significant as NNs are generally believed to be less effective, or even inappropriate, for smaller data sets. Second, our results show that the NNDP technique produces improvements over simple NNs using default settings, without model adjustment or the application of enhanced NN model building techniques. This result is important because, potentially, NNDP could simply be applied in addition to any model enhancements, such as those proposed in Markham and Ragsdale (1995) and Sexton et al. (2003), and increase accuracy even further.
Conclusions

The NNDP heuristic introduced here combines the data classification properties of a traditional statistical technique (MDM) with an NN to create classification models that are less prone to overfitting. By deterministically partitioning training data into calibration and testing samples, the undesirable effects of random data partitioning are mitigated. Computational testing shows the NNDP heuristic outperformed both the MDM and traditional NN techniques when applied to relatively small data sets. Application of the NNDP heuristic may help dispel the notion that NNs are only appropriate for classification problems with large amounts of training data. Thus, the NNDP approach holds considerable promise and warrants further investigation.
REFERENCES

Archer, N. P., & Wang, S. (1993). Application of the back propagation neural network algorithm with monotonicity constraints for two-group classification problems. Decision Sciences, 24(1), 60–75.
Burke, L. L. (1991). Introduction to artificial neural systems for pattern recognition. Computers and Operations Research, 18(2), 211–220.
Burke, L. L., & Ignizio, J. P. (1992). Neural networks and operations research: An overview. Computers and Operations Research, 19(3/4), 179–189.
Fausett, L. (1994). Fundamentals of neural networks: Architectures, algorithms, and applications. Upper Saddle River: Prentice Hall.
Glorfeld, L. W., & Hardgrave, B. C. (1996). An improved method for developing neural networks: The case of evaluating commercial loan creditworthiness. Computers and Operations Research, 23(10), 933–944.
Hoptroff, R., Bramson, M., & Hall, T. (1991). Forecasting economic turning points with neural nets. IEEE INNS International Joint Conference on Neural Networks, Seattle, WA, USA, pp. 347–352.
Hornick, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359–366.
Johnson, R. A., & Wichern, D. W. (2007). Applied multivariate statistical analysis (6th ed.). Upper Saddle River: Prentice Hall.
Klimasauskas, C. C., Guiver, J. P., & Pelton, G. (1989). NeuralWorks Predict. Pittsburgh: NeuralWare, Inc.
Manly, B. (2004). Multivariate statistical methods: A primer (3rd ed.). London: Chapman and Hall.
Markham, I. S., & Ragsdale, C. T. (1995). Combining neural networks and statistical predictions to solve the classification problem in discriminant analysis. Decision Sciences, 26, 229–242.
Patuwo, E., Hu, M. Y., & Hung, M. S. (1993). Two-group classification using neural networks. Decision Sciences, 24(4), 825–845.
Piramuthu, S., Shaw, M., & Gentry, J. (1994). A classification approach using multi-layered neural networks. Decision Support Systems, 11, 509–525.
Salchenberger, L. M., Cinar, E. M., & Lash, N. A. (1992). Neural networks: A new tool for predicting thrift failures. Decision Sciences, 23(4), 899–916.
Sexton, R. S., Sriram, R. S., & Etheridge, H. (2003). Improving decision effectiveness of artificial neural networks: A modified genetic algorithm approach. Decision Sciences, 34(3), 421–442.
Smith, K. A., & Gupta, J. N. D. (2000). Neural networks in business: Techniques and applications for the operations researcher. Computers and Operations Research, 27, 1023–1044.
Tam, K. Y., & Kiang, M. Y. (1992). Managerial applications of neural networks: The case of bank failure prediction. Management Science, 38(7), 926–947.
Wilson, R. L., & Sharda, R. (1994). Bankruptcy prediction using neural networks. Decision Support Systems, 11, 545–557.
Zobel, C., Cook, D., & Ragsdale, C. (2006). Dynamic data driven classification using boundary observations. Decision Sciences, 37(2), 247–262.
PART IV FORECASTING APPLICATIONS
FORECASTING THE 2008 U.S. PRESIDENTIAL ELECTION USING OPTIONS DATA

Christopher M. Keller

ABSTRACT

The 2008 U.S. presidential election was of great interest nationally and internationally. Interest in the 2008 election was sufficient to drive a $2.8 million options market run by INTRADE, a U.K.-based company. The options in this market are priced as European style fixed return options (FROs). In 2008, the Securities and Exchange Commission approved FROs, and both the American Stock Exchange and the Chicago Board Options Exchange began to trade them. Little research is available on trading in FROs because these markets are very new. This chapter uses the INTRADE options market data to construct exponential smoothing forecasts, which are then compared under a hypothetical trading strategy. The trading returns indicate that this market is relatively efficient, at least in the short term, but that because of the all-or-nothing payout structure of an FRO, there may exist small arbitrage opportunities.
INTRODUCTION AND BACKGROUND

The INTRADE market is a speculative prediction market. Prediction markets are also known as information markets, idea futures, event derivatives, or virtual markets, and exist for the purpose of making predictions (Spann & Skiera, 2003). Many prediction markets are publicly available: IOWA ELECTRONIC MARKETS – generally economic or political issues; TRADESPORTS – sporting events; SIMEXCHANGE – video games; HOLLYWOOD STOCK EXCHANGE – films and film-related people; and INTRADE – everything from business to politics to art, wine, and even weather, but excluding sports, which is covered by BETFAIR. Raban and Geifman (2009) provide a useful wiki page. The promise and potential of prediction markets are also discussed in Arrow et al. (2008) and Wolfers and Zitzewitz (2004, 2008).

Most evidence on prediction market efficiency compares final pre-election forecasts with actual outcomes but does not analyze the efficacy of ongoing forecasts throughout the entire market period. The results of final election market forecasts are mixed: Erikson and Wlezien (2008) claim that election prediction markets are inferior to polling estimates, whereas Jones (2008) claims that candidate futures provided the most accurate popular-vote forecasts. Lee and Moretti (2009) consider a Bayesian model of adaptive investor learning in the 2008 U.S. presidential futures market but consider only the final outcome, not the state-by-state electoral results. Chen, Wang, Yang, and Yen (2009) suggest that the minimum number of market participants in a futures market may be quite small (75).

In the INTRADE market, the options are priced as European style fixed return options (FROs). A European style FRO pays out a fixed amount at a fixed time in the future. In the case of the INTRADE market, the payout for each option is $10, and the time is fixed by the U.S. presidential election. The option payouts are tied to whether a Republican candidate or a Democratic candidate wins the electoral votes of a particular state.
INTRADE MARKET DATA

This INTRADE market was composed of 102 different assets, one for each of the two primary parties within each of 51 voting districts (including the District of Columbia), and accurately modeled the electoral process of the U.S. presidential election. The INTRADE market is a more accurate election model than the perhaps better-known Iowa Electronic Market
since the Iowa market is modeled on the total popular vote, which is not the method used in the actual election of the U.S. president, and the Iowa Electronic Markets limit positions to only $500. This chapter analyzes all daily trading data, comprising more than 5,400 individual daily trades over the 721 calendar days after the first trade. The data show variable forecast trends within the trading period. The state average number of Democratic trading days is 57.8 and of Republican trading days is 50.4. The state average Democratic trading volume is $30,284 and the average Republican trading volume is $26,035. Fig. 1 shows an example of the daily closing prices for the state of Pennsylvania, which was the most actively traded Democratic option.

Fig. 1. Daily Closing Prices for Pennsylvania.
FORECASTING METHOD

This chapter uses the Holt–Winters method for forecasting, as specified below (Winters, 1960). Let $p_i$ represent the option price on trading day i. With the standard parameters $\alpha, \beta \in (0, 1)$, for $i > 1$, recursively calculate the level and trend

$$L_i = \alpha p_i + (1 - \alpha)(L_{i-1} + T_{i-1}), \qquad T_i = \beta (L_i - L_{i-1}) + (1 - \beta) T_{i-1}$$

from the arbitrary initial specifications $L_1 = p_1$ and $T_1 = p_2 - p_1$. For any forecast period k, a forecast can then be generated as $F_{t+k} = L_t + k T_t$. Three notes are specified regarding this chapter's implementation of the
general method. One, since the initial conditions may be variously specified, this chapter chose to use the least extensive data for initiating the process: the initial level estimate is simply the first data point observed, and the initial estimate of the slope is simply the first possible slope estimate from the data. Two, because the initial estimates are based on such limited data, it is necessary to allow the process to "burn in" over an initial period so that the arbitrary initial estimates do not bias the overall forecasting effort; this chapter uses a burn-in period of 20 trading days. Three, the trading days in this model are not consecutive. The most common applications of this forecasting method are to complete data with equal and identically spaced intervals. However, since this chapter is interested in trading effects, the intervals here are not equal and identical calendar-wise, but are equal and identical in whether or not a trade occurs, that is, in trading days. In this regard, the model is reasonable because there is no necessary assumption of precluded or pent-up demand that is transacted only intermittently, as in a service stock environment such as Johnston and Boylan (1996), which may use, for example, exponentially weighted moving averages (Johnston, 1993).

For any forecast period k, the optimal values of α and β are determined for Republican options and for Democratic options by minimizing the total sum of squared errors across the 50 states. Table 1 shows the optimal values of α and β for four different forecast periods. The optimal value of β for all forecast periods is relatively constant at a value of about 10%. The optimal value of α, however, changes dramatically, from a value of about 80% for the shortest forecast period of k = 1 trading day to a value of about 20% for long forecast periods of k beyond 35 trading days. This relationship is illustrated more extensively in Fig. 2.
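For illustration, the following is a compact Python sketch of the Holt–Winters recursion over trading-day indexed prices, together with a simple grid search for α and β. The burn-in handling, the grid, and the synthetic price series are our own simplifying assumptions, not the chapter's actual optimization code.

```python
import numpy as np

def holt_winters_forecasts(prices, alpha, beta, k):
    """For each trading day t, return the k-step-ahead forecast F[t] = L[t] + k*T[t]."""
    p = np.asarray(prices, dtype=float)
    L, T = p[0], p[1] - p[0]                 # arbitrary initial level and trend
    forecasts = [L + k * T]
    for price in p[1:]:
        L_prev = L
        L = alpha * price + (1 - alpha) * (L + T)
        T = beta * (L - L_prev) + (1 - beta) * T
        forecasts.append(L + k * T)
    return np.array(forecasts)

def sse(prices, alpha, beta, k, burn_in=20):
    """Sum of squared k-step-ahead errors, ignoring a burn-in period."""
    p = np.asarray(prices, dtype=float)
    f = holt_winters_forecasts(p, alpha, beta, k)
    errors = p[k:] - f[:-k]                  # forecast made on day t targets day t + k
    return float(np.sum(errors[burn_in:] ** 2))

# Illustrative grid search over a synthetic price series for one option
prices = 5.0 + np.cumsum(np.random.default_rng(0).normal(0.0, 0.2, 200))
grid = np.round(np.arange(0.05, 1.00, 0.05), 2)
best_sse, best_alpha, best_beta = min(
    (sse(prices, a, b, k=15), a, b) for a in grid for b in grid
)
```

In the chapter, the analogous minimization is carried out over the pooled squared errors of all 50 state-level series for each party, rather than over a single series as shown here.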
Table 1. Optimal Values of Forecasting Parameters for Select Forecast Periods.

          Democratic options        Republican options
k         α        β                α        β
1         0.81     0.13             0.80     0.14
15        0.71     0.10             0.29     0.10
35        0.22     0.07             0.22     0.10
50        0.18     0.07             0.20     0.10
Fig. 2. Convergence of Forecasting Parameters, α and β, as Forecast Period k Increases.
For the shortest forecast period, k = 1 trading day, the optimal α value for both Democratic and Republican options is very high, at about 80%. This value can be understood to indicate that over such a short forecast horizon, the current market price is substantially the best estimate of tomorrow's market price. In other words, the prediction market is at least relatively efficient over a very short forecast horizon. The forecast period of k = 15 trading days is the point at which there is the most divergence between the optimal values for the Republican options and for the Democratic options. For forecasts of this length, the optimal value of α for the Democratic candidate remains relatively high at about 70%, again indicating relative market efficiency for this forecast period. On the contrary, the optimal value of α for the Republican candidate has dropped dramatically, to a value of about 29%. This much smaller value of α can be understood to indicate that, for this longer period and this candidate, the error-minimizing forecast relies much more heavily on past values. Colloquially, the support for this candidate over this period of time had stabilized, or converged to a relatively constant underlying set. Unfortunately for the Republican candidate, this rapid stabilization of support settled on an ultimately losing subset.
178
Fig. 3.
CHRISTOPHER M. KELLER
Contour Plot of a and b Parameter Optimization Sensitivity for Democratic.
a ¼ 20%. As would be expected the overall error for each of the forecasting methods for each of the respective candidates increases dramatically as the forecast horizon k increases. The Sum-of-Squared-Error (SSE) minimization is most greatly affected by the value of the b parameter. Sample contour plots of the optimal solution and surrounding values are shown below for a forecast period of k ¼ 15 in Figs. 3 and 4.
TRADING EVALUATION An overall assessment of the forecasting process is this prediction market is applied by retroactively assessing a simple trading strategy and petting in prediction markets in general is discussed in Fang, Stinchcombe, and
Forecasting the 2008 U.S. Presidential Election Using Options Data
Fig. 4.
179
Contour Plot of a and b Parameter Optimization Sensitivity for Republican.
Whiston (2007). Since the options in question are all-or-nothing options, one of the simplest investment strategies is a buy-winner strategy, a variant of buy-and-hold that uses a single trigger price to determine the "winner." That is, if the forecast at any time exceeds the specified trigger price, then the entire volume of the market is purchased at the respective closing price. The results of this analysis suggest that there is pricing behavior in the market that warrants further research. Since the Democratic candidate in fact won the election, virtually any evaluation of a buy-winner strategy for Democratic options would be profitable. This chapter instead analyzes the buy-winner trading strategy as applied only to the Republican candidate, and thus is an a fortiori analysis that attempts to mitigate any post hoc in-sample error problem. Trades in these options also incur a $0.05 cost per option to trade and a $0.10 expiry
cost if the option is in the money. These costs are incorporated in the trading strategy returns discussed later in the text.

The buy-winner strategy can be applied even with no forecast. That is, the strategy can be effected by setting a trigger price, so that anytime the market price is above this trigger price, all options in the market at that time are purchased. Basically, this strategy assumes that there is a "tipping point" (Gladwell, 2000) at which the current price indicates a future winner. This strategy yields a positive return on investment for every trigger price above $6.60. When the k = 1 forecast is applied, the buy-winner strategy yields a positive return on investment for every trigger price above $6.80. The evaluation of the buy-winner strategy gets more interesting as the forecast period increases. At the point of maximal difference between the candidates' optimal parameters, k = 15, the strategy shows a dramatic positive return on investment for every trigger price above $8.80. For forecasts beyond k = 15 periods, the buy-winner strategy shows negative returns on investment. The three variants with positive trading returns (current market price, k = 1 forecast, and k = 15 forecast) are illustrated in Fig. 5. Fig. 5 establishes that both the market price basis and the k = 1 forecast basis yield positive returns in the range of 5%–10% when implemented with any trigger price above approximately $6.50. Furthermore, this rate of
Fig. 5. Percentage Return on Investment for Buy-Winner Strategy Based on "Winner" Determination Using Three Different Data Sets.
return increases as the strategy trigger price increases to approximately $8.25, and then the rate of return begins to decrease. A simple least-squares approximation added to the graph suggests that, overall, the k = 1 forecast very slightly outperforms the simple market price forecast. Fig. 5 also establishes that the k = 15 forecast is not profitable until the trigger price reaches the relatively high value of $8.60, but thereafter the rate of return increases dramatically as the trigger price exceeds this value. It should be noted that as the trigger price increases, the volume, and hence the total profit realized, must simultaneously decrease. This is again a cautionary note that the analysis is not in any way suggesting some fool's gold of a perfect arbitrage opportunity. Rather, this analysis suggests, conservatively, that further research into the behavior of pricing past a "tipping point" for FROs is imperative. At the least, one would expect that once prices or forecasts of prices exceed some fixed level, trading at any price short of the all-or-nothing payout should virtually cease; otherwise, there appears to be a valuable arbitrage opportunity.
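For concreteness, the following minimal Python sketch shows how a buy-winner evaluation of the kind described above can be computed for a single option. The data layout (one tuple per trading day holding the signal, the closing price, and the traded volume) is an illustrative assumption; the $10 payout, $0.05 per-option trading cost, and $0.10 expiry cost are taken from the text.

```python
def buy_winner_return(days, trigger, won,
                      payout=10.0, trade_cost=0.05, expiry_cost=0.10):
    """Whenever the signal exceeds the trigger price, buy the day's entire
    traded volume at the closing price; settle at expiry and report ROI.
    `days` holds (signal, close_price, volume) tuples; `won` says whether
    the option finished in the money."""
    cost, options_bought = 0.0, 0.0
    for signal, close, volume in days:
        if signal > trigger:
            cost += volume * (close + trade_cost)
            options_bought += volume
    if options_bought == 0:
        return 0.0                                  # strategy never triggered
    proceeds = options_bought * (payout - expiry_cost) if won else 0.0
    return (proceeds - cost) / cost                 # return on investment

# Hypothetical example: the signal is a k-step forecast and the option lost
days = [(6.9, 6.8, 100), (7.4, 7.3, 50), (5.2, 5.1, 80)]
roi = buy_winner_return(days, trigger=6.6, won=False)
```

The signal can be the raw market price (the no-forecast variant) or the k = 1 or k = 15 Holt–Winters forecast, which is how the three curves in Fig. 5 are generated in the chapter.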
DISCUSSION AND CONCLUSION

These INTRADE futures are priced as European style FROs. In 2008, the Securities and Exchange Commission approved FROs, and both the American Stock Exchange and the Chicago Board Options Exchange began to trade them. Little research is available on trading in FROs because the markets are very new. This chapter provides illustrative information on this new market by examining the INTRADE trading data. In particular, the results from simulating simple trading strategies on the forecasted data indicate pricing anomalies that may be potentially profitable and that may, by inference, lead to better forecasts. Although many valid and justified qualifiers may well be applied to any simulated trading results, this chapter nevertheless provides a useful and informative forecasting analysis of a rich set of data, with implications for a newly opened type of market.
REFERENCES

Arrow, K. J., Forsythe, R., Gorham, M., Hahn, R., Hanson, R., Ledyard, J. O., Levmore, S., Litan, R., Milgrom, P., Nelson, F. D., Neumann, G. R., Ottaviani, M., Schelling, T. C., Shiller, R. J., Smith, V. L., Snowberg, E., Sunstein, C. R., Tetlock, P. C., Tetlock, P. E., Varian, H. R., Wolfers, J., & Zitzewitz, E. (2008). The promise of prediction markets. Science, 320, 977 (May 16, 2008).
Chen, A., Wang, J., Yang, S., & Yen, D. C. (2009). The forecasting ability of Internet-based virtual futures market. Expert Systems with Applications, 36(10), 12578–12584.
Erikson, R. S., & Wlezien, C. (2008). Are political markets really superior to polls as election predictors? Public Opinion Quarterly, 72(2), 190–215.
Fang, F., Stinchcombe, M., & Whiston, A. (2007). Put your money where your mouth is – A betting platform for better prediction. Review of Network Economics.
Gladwell, M. (2000). The tipping point. New York: Little, Brown and Company.
Johnston, F. R. (1993). Exponentially weighted moving average (EWMA) with irregular updating periods. Journal of the Operational Research Society, 44(7), 711–716.
Johnston, F. R., & Boylan, J. E. (1996). Forecasting for items with intermittent demand. Journal of the Operational Research Society, 47, 113–121.
Jones, R. J., Jr. (2008). The state of presidential election forecasting: The 2004 experience. International Journal of Forecasting, 24(2), 310–321.
Lee, D. S., & Moretti, E. (2009). Bayesian learning and the pricing of new information: Evidence from prediction markets. American Economic Review: Papers and Proceedings, 99(2), 330–336.
Raban and Geifman (2009). Prediction markets wiki. Available at: http://pm.haifa.ac.il
Spann, M., & Skiera, B. (2003). Internet-based virtual stock markets for business forecasting. Management Science, 49(10).
Winters, P. R. (1960). Forecasting sales by exponentially weighted moving averages. Management Science, 6, 324–342.
Wolfers, J., & Zitzewitz, E. (2004). Prediction markets. Journal of Economic Perspectives, 18(2), 107–126.
Wolfers, J., & Zitzewitz, E. (2008). Prediction markets in theory and practice. In: L. Blume & S. Durlauf (Eds), The new Palgrave dictionary of economics (2nd ed.). London: Palgrave Macmillan.
RECOGNITION OF GEOMETRIC AND FREQUENCY PATTERNS FOR IMPROVING SETUP MANAGEMENT IN ELECTRONIC ASSEMBLY OPERATIONS

Rolando Quintana and Mark T. Leung

ABSTRACT

Most setup management techniques associated with electronic assembly operations focus on component similarity in grouping boards for batch processing. These process planning techniques often minimize setup times. In contrast, grouping with respect to component geometry and frequency has been shown to further reduce assembly time. Thus, we propose the Placement Location Metric (PLM) algorithm to recognize and measure the similarity between printed circuit board (PCB) patterns. Grouping PCBs based on the geometric and frequency patterns of components in boards will form clusters of locations and, if these clusters are common between boards, similarity among layouts can be recognized. Hence, placement time will decrease if boards are grouped together with respect to geometric similarity, because the machine head will travel less. Given these notions, this study develops a new technique to group PCBs based on both component commonality and the PLM. The proposed pattern recognition method in conjunction with the
Improved Group Setup (IGS) technique can be viewed as an enhancement of the existing Group Setup (GS) technique, which groups PCBs solely according to component similarity. Our analysis indicates that the IGS performs relatively well with respect to an array of existing setup management strategies. Experimental results also show that the IGS produces a better makespan than its counterparts over a low range of machine changeover times. These results are especially important to operations that need to quickly manufacture batches of relatively standardized products in moderate to large volumes or in flexible cell environments. Moreover, the study provides justification for electronic suppliers to adopt different group management paradigms under a variety of processing conditions.
INTRODUCTION

The electronics industry has become a highly competitive environment due to the rapid growth of the global market. Adjustments in how products are manufactured have occurred constantly over the past decades. One of the salient responses to changes in global market demand is the adoption of management techniques that explicitly account for the mixture of volume levels (or batch sizes) and the variety of products to be manufactured. These changes can affect both the commercial and the military sectors of the U.S. economy. For instance, printed circuit board (PCB) assemblers are often required to produce a wide variety of boards while maintaining a high level of quality and, simultaneously, minimizing both production time and cost. To stay competitive, both sectors must therefore be agile, that is, adaptable to rapid changes in the products manufactured and able to generate reliable production schedules despite continuous changes in the production mix.

Automation in electronics assembly has also become increasingly expensive. The initial investment in pick-and-place machines can be very costly, and operating costs are high. Further, component placement equipment is often the bottleneck in the PCB assembly process. Hence, the focus of this research is on setup management and operational planning issues in electronics assembly. Specifically, we consider the operation of component placement equipment used for the assembly of surface-mount-technology (SMT) components onto a PCB in a medium-volume, low-to-medium-variety flexible manufacturing environment.
In this study, a group technology (GT) methodology is used to minimize the impact of both changeovers and placement times. GT is a manufacturing philosophy in which parts or processes are identified and grouped together when they are related or similar. Askin and Standridge (1993) state that GT is based on the concept that similar things should be made in a similar manner. One of the assumptions of GT is that setup time is small or insignificant when there is a changeover from one product to another. However, Leon and Peters (1995) point out that the setup penalty is not necessarily small in the PCB assembly process. They mention that two different boards with the same components may require different setups, since each layout (geometric placement pattern) might be different. This sheds light on our current exploration of a new technique based on pattern recognition of PCB layouts.

In order to group a set of boards into clusters (a.k.a. families) of products, numerical similarity measurements are required to mathematically (or objectively) quantify relationships among them. In the case of grouping PCBs for the purpose of assembly, variables that affect production throughput and machine setup time are considered. These variables have commonly been identified as component type, component frequency, and assembly time. The PCBs should be grouped by the similarity exhibited between boards in terms of these parameters. Here, similarity is technically defined as correspondence or resemblance between two or more boards, and similarity coefficients can thus be seen as a measure of association among them with respect to a particular goal (e.g., minimized total production time). The motivation for this research is thus based on how similarity is determined and utilized.

Grouping clusters of PCBs by the types of common components has been shown to be effective, particularly in situations where the time for changing a feeder is high. Nevertheless, it is hypothesized that, when the setup time is relatively small, the time to actually place components (assembly time) becomes more important in determining the makespan (total production time) of the batch of boards. Thus, a measure of association that produces PCB clusters that are similar with respect to the commonality of components as well as the geometric arrangement (layout) of board components may lead to a meaningful reduction in the makespan of a production batch. In our study, we propose an algorithm to recognize the geometric and frequency patterns of the components on PCBs and then create an improved group setup management strategy that exploits geometric similarity and component frequency as well as component similarity among the PCBs to be manufactured.
PROBLEM DESCRIPTION

A generic pick-and-place machine used for electronic assembly usually consists of three primary elements: a table that holds the PCB, a feeder carriage that holds the components, and a head that picks up the components from the feeders and places them on a PCB. This generic component placement machine represents the type of machine used by electronics manufacturers in the El Paso, Texas/Cd. Juarez, Mexico region. A typical SMT-PCB assembly operation is depicted in Fig. 1.

Clustering is a setup management technique widely adopted in electronics assembly process planning. The technique has been examined by many studies, including Ahmadi, Grotzinger, and Johnson (1988), Ammons et al. (1993), Ashayeri and Selen (2007), Askin, Dror, and Vakharia (1994), Chen and Chyu (2002, 2003), Hillier and Brandeau (1998), Leon and Peters (1995, 1998), Maimon and Shtub (1991), Maimon, Dar-El, and Carmon (1993), Moon (2010), Neammanee and Reodecha (2009), and Wu, Ji, and Ho (2009), among many others. For the sake of brevity, a comprehensive review of the literature is omitted here; interested readers can refer to the survey article by Allahverdi, Gupta, and Aldowaisan (1999) for more guidance. In addition, Fraser, Harris, and Luong (2007) examine the issue from a practitioner's point of view.
Fig. 1. Typical SMT Assembly Operation.
Fig. 2. Illustrative Example of PCB Similarity (Board 1 and Board 2 drawing the same component from the feeder carriage).
Traditionally, recognition of similarity has been based on component commonality between boards (Leon & Peters, 1998). However, the geometric layout and the frequency of board components are two other crucial similarity measures, and ignoring them in the PCB grouping procedure may lead to a poor makespan. Using only component similarity may produce board clusters that possess excessive aggregate assembly times. For example, the two boards shown in Fig. 2 would be treated as identical if only component similarity is taken into consideration. In fact, they are quite different in terms of the geometry and frequency of their components. Notice that one board has components clustered in the extreme left part of the board, whereas the other has them in the extreme right. Moreover, one board has three components, whereas the other has only two. By considering these two boards identical and thus placing them in the same group for production, the assembly time would be high, since the pick-and-place head would have to travel a significant distance.

Practically speaking, the total makespan of a batch of boards is the sum of all changeover and placement times, as indicated by Eq. (1). In this time model, the changeover (i.e., setup) time associated with changing one feeder (each feeder carries only one type of component) is a given constant A, and the placement head has a given constant velocity V. N[i] denotes the number of feeder changes required to change from the (i−1)th to the ith board type in the board sequence. d[i] denotes the length of the tour followed by the head to assemble a single board of the ith type in the board sequence. Finally, b[i] denotes the batch size for this particular board type.
$$\text{Makespan} = \sum_{i=1}^{|N|} \left( A\, N_{[i]} + \frac{b_{[i]}}{V}\, d_{[i]} \right) \qquad (1)$$
Eq. (1) shows the complexity of the decisions that have to be made in setup management: the number of feeder changes, the length of the tour of the head, and the board sequence are all decision variables that are directly related to the component-feeder assignment, the placement sequence, and the board sequence, respectively. It should be noted that these decisions are not independent and must be optimized simultaneously. In addition, it is also important to point out that A reflects the efficiency of the changeover technique (human-related), V reflects the type of technology used (equipment, machines, etc.), and b[i] reflects the demand for board type i.
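To make Eq. (1) concrete, the following short Python sketch computes the makespan for a given board sequence. The dictionary-based data layout and the example numbers are illustrative assumptions only.

```python
def makespan(boards, feeder_change_time, head_velocity):
    """Total makespan per Eq. (1): for each board type i in the sequence,
    add A*N[i] (changeover) plus b[i]*d[i]/V (placement)."""
    total = 0.0
    for b in boards:
        changeover = feeder_change_time * b["feeder_changes"]              # A * N[i]
        placement = b["batch_size"] * b["tour_length"] / head_velocity     # b[i] * d[i] / V
        total += changeover + placement
    return total

# Hypothetical three-board sequence (times in seconds, lengths in mm, velocity in mm/s)
boards = [
    {"feeder_changes": 12, "tour_length": 950.0,  "batch_size": 40},
    {"feeder_changes": 3,  "tour_length": 800.0,  "batch_size": 25},
    {"feeder_changes": 7,  "tour_length": 1100.0, "batch_size": 60},
]
total_time = makespan(boards, feeder_change_time=30.0, head_velocity=250.0)
```

Because N[i] depends on the component-feeder assignment and d[i] on the placement sequence, the sketch takes them as given; in practice, as noted above, they must be optimized simultaneously with the board sequence.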
GROUP SETUP STRATEGIES

In this study, we utilize a variety of existing setup strategies, namely group setup, unique setup, minimum setup, and partial setup, to examine the relative efficacy of our proposed IGS and its corresponding similarity measure. The group setup (GS) management technique uses clustering to group similar boards into clusters such that significant changeover penalties are incurred only between clusters. This strategy assumes that little or no changeover penalty is incurred between elements of the same family. Among the benefits of this strategy is that both setup and placement times are minimized. In contrast to discrete part processing, two different board types will be assigned to the same cluster if they use the same component types but still require different component-feeder assignments due to different geometric layouts of the PCBs. As a consequence, such grouping may incur a considerable changeover penalty in PCB assembly.

Another setup management technique is the unique setup, which considers one board type at a time. The component-feeder assignment and the placement sequence are specified in such a way that the placement time is minimized. This is the best technique when assembling a single board type on a single machine. The minimum setup strategy, however, attempts to sequence boards and determine component-feeder assignments so as to minimize changeover time. This technique attempts to minimize the feeder changes required to assemble a PCB, with no consideration of the effect of the feeder arrangement on the placement time. The Lofgren minimum (Lofgren and McGinnis, 1986) and Lofgren centroid minimum (Leon and Peters, 1995) techniques are found to be effective when the changeover times dominate the assembly times. Finally, Leon and Peters (1995) introduce the concept of partial setups to adapt to the rapid changes in the production environment, while Peters and
Subramanian (1996) analyze partial setups for operational planning in electronic assembly. The methodology considers the initial state of the system, and a partial setup strategy is developed as an efficient adaptation to rapid changes in the production environment. The partial setup management technique requires removal of components from the machine when changing over from one board type to the next. Removal occurs when the changeover penalty is less than the reduction in placement time. Leon and Peters (1998) later develop the Improved Partial Setup technique, which uses the group technique to construct an initial solution. A successful setup strategy is one that minimizes the complexity of the changeover for the operator, thereby minimizing costly errors. For instance, with the partial setup strategy, better-trained operators are needed because they will be required to make more decisions with respect to the component-to-slot assignments. If an operator puts a wrong component in a particular slot when partially changing the component feeders, there could be considerable losses for the company. With the group setup (GS) strategy, however, all feeder and feeder slot assignments are planned before production begins. The operator's main responsibility is reduced to replacing a used feeder with another that is already loaded with components, and conducting a complete setup only between clusters. In the extreme, minimum setups would entail only a single setup (i.e., assuming there are enough feeders to accommodate all required components). The GS technique is the easiest to implement from an administrative perspective, while performing well over various feeder change time scenarios (Leon & Peters, 1995). Readers who are interested in a taxonomical review of the subject matter and methodologies can refer to Yin and Yasuda (2006). Also, Sarker (2001) reviews measures of grouping efficiency and their drawbacks. In the current study, we propose the Improved Group Setup (IGS) management strategy, in which the boards are grouped into N clusters based on component and geometric (layout) similarities. An important constraint here is to limit the size of the clusters by not allowing the number of different component types per cluster to exceed the machine feeder capacity. Similar to Leon and Peters (1995), a composite board consisting of the superposition of all the placement locations of all the boards in a given cluster is formed. Then the component feeders are assigned (using the Linear Assignment Problem algorithm) and a board placement sequence is developed (using the Traveling Salesman Problem technique). These board clusters, called hyperboards, are composed of the superposition of all placement locations and their corresponding components in the particular board cluster.
With these new features, it is expected that the Improved Group Setup (IGS) management strategy will retain the relative simplicity of the clustering technique while further minimizing makespan by reducing placement times. In our experiment, the IGS technique is compared with the GS as well as other setup management strategies, and their relative performances are measured.
SIMILARITY OF GEOMETRIC LAYOUTS

Geometric similarity is defined as a measure of association between two boards consisting of a finite number of components, with each component represented by its X-Y coordinate (location) on the board layout. The literature suggests that this measure can be obtained by comparing the shapes formed when these components are connected by a straight line, or the average difference in elevation (distance) between the components in the respective boards that are closest (in terms of Euclidean distance) to the origin (0,0) of the board (Cronbach & Glesser, 1953; Kusiak, Vannelli, & Kumar, 1985). Fig. 3 provides a graphical example of the shape and distance measures. Note that the circles represent particular components (c) with their X-Y coordinates on a particular board j, and the squares represent components on board k. Shape (as shown in Fig. 3) is defined as the pattern of dips and rises across the variables, whereas elevation is the mean score over all of the variables. Nevertheless, Quintana, Leon, Zepeda, Arredondo, and Jimenez (1997) argue and show that shape measures are poor indicators of geometric similarity between two PCBs because of their sensitivity to shape at the expense of the magnitude of differences in placement location scatter and elevation. The distance measure for obtaining a geometric similarity coefficient between any two boards is the Euclidean distance between the respective locations of the two boards, coupled with a frequency measure that compares the number of placement locations on the two boards. Any distance measure is actually a dissimilarity measure: the smaller the distance, the higher the similarity between any two points, and vice versa. A placement location in board j is considered identical to a corresponding placement location in board k if each location is described by variables with the same magnitude. Since each location is described by an X-Y coordinate, the variable used will be the Euclidean distance of the (X,Y) point from (0,0). Therefore, the difference between the Euclidean distances of respective
component locations in two different boards will serve as the elevation difference between those two items. A drawback to using distance metrics as measures of association is that the estimation of similarity between boards is affected by elevation differences. In essence, points with large size differences and standard deviations can distort the effects of other points with smaller absolute sizes and standard
deviations. Distance measures are also affected by transformations of the scale of measurement of the points since Euclidean distance will not preserve distance rankings (Everitt, 1980). In the next section, we will develop the Placement Location Metric (PLM) to recognize and measure the similarity between the geometric layouts of two PCBs. The metric takes into account both distance and frequency measures as described earlier.
PLACEMENT LOCATION METRIC FOR SIMILARITY OF GEOMETRIC LAYOUT

Typically, PCB data provide information on component type (CT), the components' X- and Y-coordinates, and an angular orientation (which is not used in this study). The first step is to transform the X-Y coordinates into their Euclidean distance equivalents, as shown in Eq. (2):

\[ \text{Point} = \sqrt{X^{2} + Y^{2}} \qquad (2) \]

Each location is represented by the magnitude of the point, rather than an (X,Y) coordinate. It is important to note that each dimension could be handled separately. However, doing so would not provide any additional information, given that the goal of this research is a relative measure of association for pattern recognition. Thus, a single measure (i.e., the point) is chosen at this stage because it is easier to manipulate. Usually, a board can have hundreds of components, and a production batch may contain thousands of boards. The next step is to sort the board representation database in ascending order of point magnitude. The purpose of this re-arrangement is to allow comparison of the smallest point in board j with the smallest counterpart in board k. In practice, this is handled by using the smaller number of sorted placement locations between boards j and k. The frequency factor, which will be explained later, is then used to systematically recognize and capture the degree of difference in the numbers of locations per component type between any two boards. Interestingly, this method of layout pattern recognition does not distinguish between component types but, rather, simply uses locations (i.e., points) for determining similarity. The rationale behind this approach (based on the pick-and-place machines studied and generalized) is that the time it takes for the feeder to position itself, so that the robot can pick it up, is insignificant when compared to the time it takes for the pick-and-place robot to position
and orient the component on the board.

Fig. 4. Location versus Component-Type Similarity.

Fig. 4 exemplifies this problem. If the individual component types of boards A and B are compared, it can be seen that they are closer to each other than to the respective component types of board C. Yet when component type is disregarded and only locations are considered, the components in boards A and C, respectively, are clustered closer to each other. Therefore, in a holistic sense, board pair A-B is geometrically more similar than A-C or B-C. The distance (d) measure of dissimilarity is then computed by Eq. (3) as follows:

\[ d_{jk} = \sqrt{\sum_{i=1}^{n} \left( P_{ij} - P_{ik} \right)^{2}} \qquad (3) \]
where Pij is the point magnitude of the Euclidean distance of the coordinate (X,Y) at the ith sorted location for board j, and n = min(nj, nk), where nj and nk are the numbers of locations on boards j and k. Afterward, the distance is normalized to provide a measure between 0 and 1, inclusive. The normalization is achieved by dividing the value of djk by the maximum djk possible given the maximum ranges of the largest board. Let Xrange and Yrange be, respectively, the Cartesian dimensions of the largest board. The maximum djk would be reached if each pair of respective points in boards j and k were separated by the maximum possible distance, as given by Eq. (4):

\[ d_{\max jk} = \sqrt{n \left( \text{Xrange}^{2} + \text{Yrange}^{2} \right)} \qquad (4) \]

Subsequently, Eq. (5) yields the normalized distance measure (D) between boards j and k:

\[ D_{jk} = \frac{d_{jk}}{d_{\max jk}}, \quad \text{where } 0 \le D_{jk} \le 1 \qquad (5) \]
As previously noted, Djk provides a normalized measure of dissimilarity; the normalized distance-based similarity measure is therefore (1 - Djk). The other factor required for identifying the similarity between PCB geometric layouts is component frequency. Essentially, frequency measures the ratio of the number of placement locations between any two boards. Suppose that boards X, Y, and Z are very similar in the clustering of their component locations. If X and Y have a similar number of components whereas board Z contains fewer components than X or Y, then it is expected that boards X and Y exhibit a higher level of similarity in terms of frequency patterns. In a physical sense, the pick-and-place head would travel a more similar distance in assembling boards X and Y than in assembling board Z. The frequency factor (F) thus captures the similarity of the number of placements between any two boards. As a result, we define F as the ratio of the numbers of components (NC) required for boards j and k. The mathematical expression is shown in Eq. (6):

\[ F_{jk} = \frac{\min(NC_{j}, NC_{k})}{\max(NC_{j}, NC_{k})}, \quad \text{where } 0 \le F_{jk} \le 1 \qquad (6) \]
When we combine the similarity measures reflecting the patterns of component distance (location) and frequency, the PLM is obtained. Therefore, the PLM for geometric layout similarity between boards j and k is defined by Eq. (7):

\[ \text{PLM}_{jk} = (1 - D_{jk}) \, F_{jk} \qquad (7) \]
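As a worked illustration of Eqs. (2)-(7), the sketch below computes the PLM for two hypothetical boards given only their placement coordinates. The board data and the Xrange/Yrange values are invented for the example; only the formulas follow the text above.

```python
import math

def plm(board_j, board_k, x_range, y_range):
    """Placement Location Metric per Eqs. (2)-(7).

    board_j, board_k -- lists of (x, y) placement locations on each board
    x_range, y_range -- Cartesian dimensions of the largest board (Eq. 4)
    """
    # Eq. (2): represent every location by its Euclidean distance from (0, 0),
    # then sort so the ith smallest points of the two boards are compared.
    pj = sorted(math.hypot(x, y) for x, y in board_j)
    pk = sorted(math.hypot(x, y) for x, y in board_k)

    n = min(len(pj), len(pk))                                        # compare the first n sorted points
    d_jk = math.sqrt(sum((pj[i] - pk[i]) ** 2 for i in range(n)))    # Eq. (3)
    d_max = math.sqrt(n * (x_range ** 2 + y_range ** 2))             # Eq. (4)
    D_jk = d_jk / d_max                                              # Eq. (5)

    # Eq. (6): frequency factor -- ratio of the numbers of placements.
    F_jk = min(len(board_j), len(board_k)) / max(len(board_j), len(board_k))

    return (1.0 - D_jk) * F_jk                                       # Eq. (7)

# Hypothetical 100 mm x 150 mm boards with a handful of placements each.
board_a = [(10, 20), (15, 25), (40, 60), (70, 90)]
board_b = [(12, 22), (16, 24), (42, 58)]
print(round(plm(board_a, board_b, x_range=100.0, y_range=150.0), 3))
```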
EXPERIMENTAL DESIGN

In this section, we describe and explain the details of the simulation experiment used to test the proposed paradigm against a spectrum of existing setup management strategies. In summary, random boards are generated with a pre-determined degree of component and geometric association. The PCB components and placements are randomly generated to resemble industrial data obtained from four electronic assembly companies in the El Paso, Texas/Cd. Juarez, Mexico manufacturing base. The distance metric is then calculated for this batch of random boards and, via graphical and correlation analysis, the metric is evaluated. The second part of the experiment compares the IGS technique, which uses the PLM, the proposed similarity measure of geometric patterns for PCB layouts, to group boards
into production clusters. Results are directly contrasted to the GS strategy, which uses only component commonality.
Random Board Generator

In order to evaluate the geometric measure of association, an algorithm is used to generate similar PCBs at random. This algorithm consists of a random number generator of values uniformly distributed between 0 and 1. A geometric similarity (G) and a component similarity (C), both in the range [0, 1], serve as control parameters. If the value of G or C is small, the boards are different from each other; the closer the parameters are to one, the more similarity is exhibited between the two boards. The generic pick-and-place machine, illustrated in Fig. 1, has 70 feeder slots with 10 mm between slots. The board dimensions are 100 mm by 150 mm. From the board descriptions provided by the industrial partners, the placement locations per board range from 30 to 500, whereas the component types per board range from 5 to 60, with most of these boards at the lower end of the ranges. The PCBs generated for the study have 30 component types and a maximum of 50 locations. A "seed" board with component locations Lo and corresponding component types Co is first created. Then, based on a desired G, other boards (called child-boards) are created using Eq. (8):

\[ L(i) = Lo(i) + G \times R \times \text{UNIFORM}(-1, 1) \qquad (8) \]
where L(i) is the ith location and R a given range. C is determined from a desired input value (Co) in the following manner: C(i) = Co(i) with probability C, and C(i) = UNIFORM(1, Nc) otherwise, where C(i) is the component required at the ith location and Nc is the number of component types. The program generates N_boards boards with geometric similarity G and component similarity C. In addition, each board has N_locations locations and utilizes Nc different component types. Typically, the number of locations is greater than the number of different components since the same type of component can be used in several locations. For each of the boards generated, a placement location will be produced, providing a board number, X-coordinate, Y-coordinate, and the type of component required. If a random placement location exceeds the physical placement parameters of the board, or if a random draw is equal to an established placement, that random draw is discarded and another is performed.
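The following is a minimal sketch of the generator logic described above, applying Eq. (8) as reconstructed and the probabilistic component rule for C. The seed board, the range R, the board dimensions, and all parameter values in the usage example are illustrative assumptions.

```python
import random

def generate_child_board(seed_locations, seed_components, G, C, R,
                         n_component_types, board_w=100.0, board_h=150.0):
    """Generate one child board from a seed board (sketch of the procedure above).

    seed_locations  -- Lo, list of (x, y) seed placement locations
    seed_components -- Co, component type required at each seed location
    G, C            -- geometric and component similarity parameters in [0, 1]
    R               -- range used to perturb the seed locations (Eq. 8, as reconstructed)
    """
    locations, components = [], []
    for (x0, y0), c0 in zip(seed_locations, seed_components):
        while True:
            # Eq. (8): perturb each seed coordinate by G * R * UNIFORM(-1, 1).
            x = x0 + G * R * random.uniform(-1, 1)
            y = y0 + G * R * random.uniform(-1, 1)
            # Discard draws that fall outside the board or duplicate a placement.
            if 0 <= x <= board_w and 0 <= y <= board_h and (x, y) not in locations:
                break
        locations.append((x, y))
        # Keep the seed component type with probability C, otherwise draw at random.
        if random.random() < C:
            components.append(c0)
        else:
            components.append(random.randint(1, n_component_types))
    return locations, components

# Illustrative seed board with three placements and 30 component types.
locs, comps = generate_child_board(
    seed_locations=[(10, 20), (50, 60), (80, 120)],
    seed_components=[3, 7, 12],
    G=0.8, C=0.9, R=10.0, n_component_types=30)
print(locs, comps)
```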
To test the effectiveness of the board generator, a comparison is made between the seed-board and the child-boards in terms of a predetermined geometric similarity between them, by producing five pairs of boards, each with 30 components of the same type and geometric similarities (G) of 0.5, 0.6, 0.7, 0.8, and 0.9. If the board generator is producing "good" boards, then the difference in distance between the locations of two boards with lower similarity should be greater than that between two boards with higher similarity.
Process Planning

The simulated boards represent a mixture of the different boards that are encountered at various industrial partners, including Rockwell and Thompson Consumer Electronics. These boards are used to compare the relative performances of the IGS and the GS strategies. The IGS incorporates information on both geometric layout patterns and component commonality, whereas the GS uses only component commonality. As described earlier, the IGS uses the PLM for determining the similarity of PCB layouts. There are two fundamental sets of parameters considered in the assembly process of PCBs: machine data and board data. The simulation explicitly controls these data to create a factorial experimental design. Furthermore, the information is crucial in determining the start-up conditions of the assembly system, which include the following settings:
- Number of component types per board
- Number of feeders
- Geometric position of the feeders (X and Y coordinates)
- Set up time to install or remove feeders
- The origin of the board in X-Y coordinates
- The head velocity
- The number of boards to be processed
- The number of placements per board
- Placement location at X-Y coordinates
- Batch size
- Percent of common components
Ranges for each of these settings are listed in Table 1. Using this information, a total of 16 problem types are created by varying the levels of the batch size, the number of feeders, the head velocity, and the percentage of common components.
Table 1. Parameters and Ranges for the Simulated Electronic Assembly System.

Parameters                                      Ranges
Number of component types per board             U(6,20), U(15,50)
Number of feeders                               20, 50
Set up time to install or remove feeders        0, 30, 60, 90, 120 (s)
The origin of the board in X-Y coordinates      (0,0)
The head velocity                               100, 300 (mm/s)
The number of boards to be processed            U(5,15)
The number of placements per board              U(50,300)
Placement location at X coordinate              635 + U(0,254)
Placement location at Y coordinate              305 + U(0,203)
Batch size                                      b1 = U(10,50), b2 = U(50,100)
Percent of common components                    Low, 20%; high, 75%

Table 2. Problem Types and Levels in Factorial Experimental Design.

Problem Type   Batch Size   Number of Feeders   Head Velocity (mm/s)   Percentage Common
1              B1           20                  100                    20
2              B2           20                  100                    20
3              B1           50                  100                    20
4              B2           50                  100                    20
5              B1           20                  300                    20
6              B2           20                  300                    20
7              B1           50                  300                    20
8              B2           50                  300                    20
9              B1           20                  100                    75
10             B2           20                  100                    75
11             B1           50                  100                    75
12             B2           50                  100                    75
13             B1           20                  300                    75
14             B2           20                  300                    75
15             B1           50                  300                    75
16             B2           50                  300                    75
Next, for each problem type, 40 random problems are generated by changing the levels of the time to remove or install a feeder. This factorial design is tabulated in Table 2.

Simulation of Group Setup and Assembly System

After the parameters listed in the previous section have been entered, the next step is to group the boards into clusters based on the component and
geometric similarity. Specifically, the following steps are performed to group the boards:

1. Merge clusters i and j by taking the elements of j and adding them to i. Cluster j is then deleted accordingly.
2. Update the "hyperboard." In this step, component types for cluster j are moved to cluster i. The components of the remaining clusters shift up and take the place of the merged family.
3. Collapse the similarity coefficient matrix (s[ ][ ]) at the row and column for cluster j. This makes the matrix smaller since the clusters are already grouped.
4. Update the number of clusters.

The clustering technique used to group similar boards into clusters is the Hierarchical Clustering Algorithm (HCA) developed by Anderberg (1973) and Ham, Hitomi, and Yoshida (1985). The number of changeovers is then calculated using Eq. (9):

\[ \text{No. of changeovers} = \frac{2 \times (\text{No. of families}) - 1}{\text{No. of families}} \qquad (9) \]

This equation reflects that a changeover takes place before and after each board cluster is run, with the exception of the last cluster, which does not require a feeder changeover upon completion. The changeover time for the entire batch is then determined by Eq. (10):

\[ \text{Change time} = (\text{No. of changeovers}) \times (\text{Feeder change time}) \qquad (10) \]
The last step used in HCA determines the objective function value for the solutions obtained in previous steps. The purpose of the objective function is to get the minimum values for assembly and setup times. In former studies used to analyze group setup strategy, only component similarity is taken into consideration. Hence, the objective function only minimizes the number of changeovers. In other words, the objective function determines the minimum sum of the numbers of components for all clusters in all iterations. Since any information about geometric similarity is disregarded by using this objective function, it is important to formulate an appropriate version that takes board geometrical pattern into consideration. As a result, this new objective function also minimizes the distance traveled by the head and thus the total assembly time.
Essentially, the similarity between two boards i and j is given by Eq. (11):

\[ S_{ij} = \frac{W_{C} S_{C} + W_{G} S_{G}}{W_{C} + W_{G}} \qquad (11) \]
where SG is the geometric similarity and SC the component similarity. The relative weights of each measure (WG and WC, respectively) are taken to be 1 in this study; the determination of these weights is discussed later as an important area of future research. Sij measures the degree of both component-type and geometric similarity between boards. Thus, the higher the sum of the similarities between all boards, the better the cluster formation with respect to both setup and placement time. The number of changeovers directly affects setup time, while placement time is directly affected by placement distance (the distance the machine head must travel to place components). As such, a cluster of boards can be considered a permutation of any ordered sequence of k boards taken from a set of n distinct boards without replacement (clusters are unique). Since the value of each similarity measure x lies between 0 and 1, the sum of the values for any particular cluster with k boards lies between 0 and k. The maximum sum for any cluster i is max{(kx)i}. This upper bound for the similarity measure corresponds to a lower bound for the sum of setup and placement time, since the latter is the converse of the similarity measure. Therefore, the number of changeovers (affected by component commonality) and the placement distance (affected by geometric commonality) will be at a minimum for the assembly of all board clusters when the sum of the similarity measures per hyperboard is maximized. The objective function is given by Eq. (12):

\[ \max z = \sum_{i,j=1}^{k} S_{ij} \qquad (12) \]
Once a maximum similarity is obtained, the minimum placement distance for the board clusters is calculated using the Traveling Salesman Problem (TSP). The placement time is then calculated by Eq. (13):

\[ \text{Placement time} = \sum_{i=1}^{\text{No. of families}} \frac{\text{Placement distance}_{i}}{\text{Machine head velocity}} \qquad (13) \]
Finally, the total time for the assembly operation or makespan is determined by adding the changeover time and the placement time, as given in Eq. (1).
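The following sketch strings together Eqs. (11)-(13) for a hypothetical grouping. The similarity values, placement distances, and head velocity are invented for illustration; the weights are set to 1, as in the study.

```python
def combined_similarity(s_component, s_geometric, w_c=1.0, w_g=1.0):
    # Eq. (11): weighted combination of component and geometric similarity.
    return (w_c * s_component + w_g * s_geometric) / (w_c + w_g)

def clustering_objective(pairwise_similarities):
    # Eq. (12): sum of combined similarities over the board pairs assigned
    # to the same cluster (higher values indicate a better grouping).
    return sum(pairwise_similarities)

def placement_time(cluster_distances, head_velocity):
    # Eq. (13): total placement time over all families (hyperboards).
    return sum(d / head_velocity for d in cluster_distances)

# Hypothetical intra-cluster board pairs: (component similarity, geometric similarity).
pairs = [(0.9, 0.8), (0.7, 0.2), (0.85, 0.75)]
sims = [combined_similarity(sc, sg) for sc, sg in pairs]
print(clustering_objective(sims))                 # value of Eq. (12) for this grouping
print(placement_time([52000.0, 61000.0], 300.0))  # seconds, per Eq. (13)
```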
Fig. 5. Schematic Flowchart and Overview of Simulation Experiment.
For the sake of brevity, a schematic flowchart of the simulation is delineated in Fig. 5 and thus the details are not repeated here in narration.
RESULTS

Table 3 reports the simulation results for each setup strategy. It can be seen that no single setup strategy dominates all five performance measures; a particular strategy that performs well on certain measures may be lacking on others. However, the primary purposes of our current study are to develop a more reliable setup strategy to minimize the total makespan and to examine the efficacy of using geometric pattern recognition in cluster formation. Judged on these basic tenets, the proposed IGS strategy yields the best performance in terms of the mean total production time (i.e., makespan) when the feeder change time is below 60 s.
Table 3. Simulation Results of Electronic Assembly Operations.

                                          Feeder Change Time (in s)
Measure / Setup strategy              1            30            60            90           120

Mean number of feeder changes
Group                            155.25        155.25        155.25        155.25        155.25
Improved group                   207.60        207.60        182.24        197.01        202.73
Unique                           429.43        429.43        429.43        429.43        429.43
Lofgren                           39.38         39.38         39.38         39.38         39.38
Lofgren-centroid                  39.38         39.38         39.38         39.38         39.38
Partial                          488.61        312.17        241.62        201.16        153.55
Improved partial                 485.92        301.54        228.04        186.56        138.73

Mean total distance traveled by placement head (in mm)
Group                        73285762.6    73285762.4    73285762.6    73285762.6    73285762.6
Improved group               70647378.0    70647378.0    72376539.0    71930047.5    71246098.5
Unique                       70797491.0    70797490.9    70797491.0    70797491.0    70797491.0
Lofgren                      83508898.7    83508898.5    83508898.7    83508898.7    83508898.7
Lofgren-centroid             75615680.3    75615680.1    75615680.3    75615680.3    75615680.3
Partial                      70797106.7    71114463.6    71493959.4    71836824.9    72156647.3
Improved partial             70797491.0    71135762.8    71557083.3    71926120.8    72295763.7

Mean feeder changeover time (in s)
Group                            279.30       8378.95      16757.89      25136.84      33515.78
Improved group                   390.62      11718.67      20202.25      33071.78      45596.35
Unique                           772.55      23176.46      46352.92      69529.37      92705.83
Lofgren                           70.84       2125.27       4250.54       6375.82       8501.09
Lofgren-centroid                  70.84       2125.27       4250.54       6375.82       8501.09
Partial                          879.02      16847.93      26080.43      32569.43      33149.36
Improved partial                 874.19      16274.35      24615.39      30206.03      29949.24

Mean placement time (in s)
Group                         195428.70     204868.26     195428.70     195428.70     195428.70
Improved group                188393.01     197402.28     193961.93     191814.93     190388.74
Unique                        188793.31     197821.73     189730.24     188794.76     189189.94
Lofgren                       222690.40     233339.83     223795.54     222692.10     223158.24
Lofgren-centroid              201641.81     211284.67     202642.50     201643.36     202065.44
Partial                       188792.28     198707.41     191596.70     191566.33     192821.97
Improved partial              188793.31     198766.93     191765.87     191804.46     193193.73

Mean total production time (in s)
Group                         195708.00     213247.20     212186.59     220565.54     228944.48
Improved group                188783.63     209120.95     214164.18     224886.71     235985.09
Unique                        189565.86     220998.19     236083.15     258324.13     281895.77
Lofgren                       222761.24     235465.11     228046.09     229067.92     231659.33
Lofgren-centroid              201712.66     213409.94     206893.05     208019.18     210566.53
Partial                       189671.31     215555.34     217677.13     224135.77     225971.33
Improved partial              189667.49     215041.28     216381.26     222010.48     223142.97
This finding confirms the conjecture stated in previous sections. When the feeder change time is 60 s or longer, the dominant setup technique is Lofgren-centroid. In short, IGS performs well up to a certain threshold in feeder changeovers. To further explore this issue, the total makespan of each of the group setup strategies is plotted against the feeder change time, and these performance sketches are depicted in Fig. 6. In a practical sense, the IGS improves placement distance, whereas the GS is quite consistent with respect to setup time. As aforementioned, there will be a trade-off in feeder change time between the two strategies. At lower feeder change times, the number of changeovers impacts total time to a lesser extent, and placement distance has a greater impact on production time. Fig. 6 depicts this tradeoff between the two techniques with respect to total makespan. At feeder change times below 50.28 s, the IGS outperforms the GS, while above this point the converse is true. As shown in both Table 3 and Fig. 6, the Lofgren-centroid setup strategy dominates its counterparts with respect to total makespan when the feeder change time is 60 s or above. This observation is possibly due to the Lofgren-centroid's more flexible capability of adapting to different production conditions. Unfortunately, this adaptive capability is also the reason why the Lofgren-centroid setup scheme is more difficult to manage; specifically, it requires more complex decisions by the operator. This is also why the GS strategy has been used as the basis for the setup management technique developed in this study. The IGS performs relatively well versus the other setup management strategies and, coupled with its administrative and operational simplicity, represents a viable and efficient option for electronics assembly systems, especially those utilizing moderate- to high-volume flexible cells. From an operational point of view, the proposed IGS should realize its best effectiveness when the ratio of batch volume to number of part types is relatively high. The primary hypothesis behind this study is that recognition and incorporation of similarity of geometric pattern should improve placement distance, when compared to a grouping scheme based only on component similarity, for specific feeder change time scenarios. In other words, the total makespan of the assembly operation should be reduced. The underlying rationale is that a board cluster with components "closer" to each other shortens the distance traveled by the head for that particular cluster. Nevertheless, the adoption of geometric similarity in setup management may "degenerate" the component similarity of board clusters and possibly create more groups. As a result, the number of setups and the setup time increase.
Fig. 6. Total Production Times (Makespans) for Different Setup Strategies.
It is expected that at lower feeder change times the improvement in placement distance will overcome this problem. For example, say that the component and geometric similarity measures between two boards are 0.7 and 0.2, respectively. In the IGS technique, the measure of association would be computed as (0.7 + 0.2)/2 = 0.45, a relatively weak association, so these boards would probably not be associated into one family, whereas they probably would be if only component similarity were used for grouping boards. Situations like these should cause more board clusters to be created when both measures are used, weakening the board family association based on component similarity. Conversely, the resultant board family from fusing these measures of association weakens the geometric bond. A tradeoff is therefore expected, with feeder change time being the key parameter.
CONCLUSIONS

It is widely accepted that clustering is an effective process planning technique for batch processing in electronic assembly operations because setup and changeover times can be reduced. The fundamental premise underlying this research is that clustering with respect to similarity of geometric layouts and frequencies of the board components, in addition to their commonality, should further minimize assembly time and, hence, total production time (makespan). In this study, the PLM methodology is developed to recognize and calibrate the geometric similarity between two boards. Consequently, component commonality and the PLM are integrated into a single pattern similarity measure, which is then used to group PC boards into clusters for production. This new setup management strategy is called the IGS technique. Based on information and data from an actual manufacturing environment, a simulation experiment capturing different setup strategies and randomized board layout patterns is conducted. As a result, the efficacy of the proposed IGS can be benchmarked against an array of existing setup management strategies, especially the GS technique, which considers only component commonality in forming board clusters. The experimental outcomes suggest that the IGS performs relatively well and efficiently with respect to total makespan when the changeover times are moderate to low. Given its administrative and operational simplicity, the IGS appears to be a viable methodology for the electronics assembly industry. At feeder change times of less than 30 s, the IGS can lower the makespan of the simulated PCBs (relative to the GS) by less than 2 h, or approximately one-fourth of a typical 8-h production shift. Thus, an electronics manufacturer can see a significant
decrease in production time per board. This can further contribute to responsiveness to changes in customer orders and provide better service to customers. Future work still remains, particularly in the assignment of weights to geometric and component similarity. This type of decision entails a compromise among a number of often conflicting decision criteria, such as board criteria (geometric and component commonality), equipment criteria (machines, fixtures, and tooling), setup and loading criteria, personnel assignment criteria, product mix criteria, and schedule requirement criteria. Typically, the relative weights between board geometry and component commonality may vary depending on the specific problem under consideration. The IGS must thus become an interactive tool wherein the user (or management) can provide direction for the solution search.
REFERENCES

Ahmadi, J., Grotzinger, S., & Johnson, D. (1988). Component allocation and partitioning for a dual delivery placement machine. Operations Research, 36, 176–191.
Allahverdi, A., Gupta, J. N. D., & Aldowaisan, T. (1999). A review of scheduling research involving setup considerations. Omega, 27, 219–239.
Ammons, J. C., Carlyle, W. M., DePuy, G. W., Ellis, K. P., McGinnis, L. F., Tovey, C. A., & Xu, H. (1993). Computer-aided process planning in printed circuit card assembly. IEEE Transactions on Components, Hybrids and Manufacturing Technology, 16, 370–376.
Anderberg, M. R. (1973). Cluster analysis for applications. New York: Academic Press.
Ashayeri, J., & Selen, W. (2007). A planning and scheduling model for onsertion in printed circuit board assembly. European Journal of Operational Research, 183, 909–925.
Askin, R. G., Dror, M., & Vakharia, A. J. (1994). Printed circuit board family grouping and component allocation for a multi-machine, open shop assembly cell. Naval Research Logistics, 41, 587–608.
Askin, R. G., & Standridge, C. R. (1993). Modeling and analysis of manufacturing systems. New York: Wiley.
Chen, W. S., & Chyu, C. C. (2002). Reduction of printed circuit board group assembly time through the consideration of efficiency loss of placement time. Assembly Automation, 22, 360–370.
Chen, W. S., & Chyu, C. C. (2003). A minimum setup strategy for sequencing PCBs with multislot feeders. Integrated Manufacturing Systems, 14, 255–267.
Cronbach, L., & Glesser, G. (1953). Assessing similarity between profiles. Psychological Bulletin, 50, 456–473.
Everitt, B. (1980). Cluster analysis. New York: Halsted Publishers.
Fraser, K., Harris, H., & Luong, L. (2007). Improving the implementation effectiveness of cellular manufacturing: A comprehensive framework for practitioners. International Journal of Production Research, 45, 5835–5856.
Ham, I., Hitomi, K., & Yoshida, T. (1985). Group technology: Applications to production management. London: Kluwer-Nijhoff Publishing.
Hillier, M. S., & Brandeau, M. L. (1998). Optimal component assignment and board grouping in printed circuit board manufacturing. Operations Research, 46, 675–689.
Kusiak, A., Vannelli, A., & Kumar, K. R. (1985). Cluster analysis models and algorithms. Control and Cybernetics, 15, 139–154.
Leon, V. J., & Peters, B. A. (1995). Re-planning and analysis of partial setup strategies in printed circuit board assembly systems. International Journal of Flexible Manufacturing Systems, 8, 389–412.
Leon, V. J., & Peters, B. A. (1998). Comparison of setup strategies for printed circuit board assembly. Computers and Industrial Engineering, 34, 219–234.
Lofgren, C. B., & McGinnis, L. F. (1986). Dynamic scheduling for flexible printed circuit card assembly. Proceedings of the IEEE Systems, Man, and Cybernetics. Atlanta, Georgia.
Maimon, O., Dar-El, E., & Carmon, T. F. (1993). Setup saving schemes for printed circuit board assembly. European Journal of Operational Research, 70, 177–190.
Maimon, O., & Shtub, A. (1991). Grouping methods for printed circuit board assembly. International Journal of Production Research, 29, 1379–1390.
Moon, G. (2010). Efficient operation methods for a component placement machine using the patterns on printed circuit boards. International Journal of Production Research, 48, 3015–3028.
Neammanee, P., & Reodecha, M. (2009). A memetic algorithm-based heuristic for a scheduling problem in printed circuit board assembly. Computers and Industrial Engineering, 56, 294–305.
Peters, B. A., & Subramanian, G. S. (1996). Analysis of partial setup strategies for solving the operational planning problem in parallel machine electronic assembly systems. International Journal of Production Research, 34, 999–1021.
Quintana, R., Leon, V. J., Zepeda, R., Arredondo, R., & Jimenez, J. (1997). Similarity measures for PC board grouping. Industrial Engineering Research Conference Proceedings. Miami, Florida.
Sarker, B. R. (2001). Measures of grouping efficiency in cellular manufacturing systems. European Journal of Operational Research, 130, 588–611.
Wu, Y., Ji, P., & Ho, W. (2009). Optimizing PCB assembly for family setup strategy. Assembly Automation, 29, 61–67.
Yin, Y., & Yasuda, K. (2006). Similarity coefficient methods applied to the cell formation problem: A taxonomy and review. International Journal of Production Economics, 101, 329–352.
USING DIGITAL MEDIA TO MONITOR AND FORECAST A FIRM'S PUBLIC IMAGE

Daniel E. O'Leary

ABSTRACT

The purpose of this chapter is to investigate the notions of "Public Image Monitoring and Forecasting" done using media available in digital formats, such as blogs, discussion groups, and news articles, referred to in the aggregate as "digital scuttlebutt." This chapter analyzes the purposes behind development of such a system and the different kinds of information that such a system would draw from. The chapter also investigates construction and extensions of a public image monitoring system designed to troll through various digital media to better understand a firm's public image.
INTRODUCTION

In late 2005, IBM (2005) announced the development of a system ("Public Image Monitoring Solution") that was designed to search through a range of different digital media, including news stories, blogs, consumer review sites, and other digitally available material, to find information about a particular company that related to that company's "public image."
Armed with this information, executives would be in a position to respond to the comments and information appearing in the digital material and/or leverage the information in response to the commentaries. As an example, an IBM executive discussed how the system could be used at an automotive company to search through news stories, news groups, and blogs as a means of attempting to gauge the public’s sentiment about the manufacturer (Anonymous, 2005). Information about different topics such as fuel efficiency could be gathered to determine whether public opinion was positive or negative. However, beyond that, the extent to which positive or negative opinion changed over time in response to different events also could be analyzed and used as a means of ‘‘forecasting’’ public image. Such information could then be used by different people in the firm to do either ‘‘damage control’’ or leverage the information in marketing or other types of activities. Unfortunately, there is limited description of this system and extensions of this particular system (such as forecasting) available in the literature. As a result, the purpose of this chapter is to examine why firms would want such a system, investigate the information on which such a system might be based, understand how this system might work, investigate going beyond just monitoring, to forecasting, and establish what might be reasonable expectations about the findings of such a system.
What is ‘‘Public Image?’’ The Macmillan dictionary defines public image as ‘‘the ideas and opinions that the public has about a person or an organization that may not be what they are really like.’’ Since public image refers to the ‘‘public,’’ this suggests that information must come from a broad base of sources, and not any single source, since the notion of the ‘‘public’’ is broad-based. In addition, if we are to access public opinion, then we need to access ‘‘ideas’’ and ‘‘opinions’’ about the specific entity (e.g., company) that we are concerned with. Furthermore, the definition of public opinion also states that the public image may not be what the entity is really like. As a result, a public image is something that is to be monitored, so that the entity can keep track of what is thought of it, since there apparently is no direct tie to any intrinsic true image. Finally, since public opinion is based on ideas and opinions, the entity’s ‘‘public image’’ can change over time, and thus firms must monitor that change.
Approach Used in this Chapter

This chapter is, in part, a case study of IBM's public image monitoring solution (e.g., IBM, 2005). However, there apparently is not much information about the system in the literature. Since little in the literature directly addresses that system, this chapter speculates about how such a system could be used and constructed, and how it could be extended, for example, beyond monitoring to include forecasting and other issues.

This Chapter

Accordingly, this chapter proceeds in the following manner. The first section has summarized the problem addressed in the chapter, defined "public image," discussed an example of how monitoring and forecasting public image may be employed, and reviewed the approach that the chapter will take. The second section investigates why firms would be interested in public image monitoring and forecasting. The third section analyzes why firms might be particularly interested in monitoring digital media such as blogs. The fourth section examines other characteristics of information that needs to be monitored, including whether the information used is from inside or outside sources. The fifth section analyzes approaches to gathering sentiment that likely captures the notion of "public image." In addition, that section also investigates tracking that sentiment and forecasting forward with that sentiment about public image. The penultimate section provides a brief analysis of expectations for the system. The final section briefly summarizes the chapter and discusses some other extensions.
WHY WOULD FIRMS BE INTERESTED IN MONITORING AND FORECASTING PUBLIC IMAGE?

Understanding what others are saying about a company, in real time, can provide value-creating opportunities. Accordingly, the purpose of this section is to examine a number of different reasons why firms might be interested in monitoring and forecasting public image.

Image Affects Stock Price

Perhaps the most important aspect of public image is that it can affect the share price. At least two sets of agents are concerned with the
impact of image on stock price, each with different rationales. Managers generally are interested in "managing" a firm's public image so that the stock price is maximized, typically because their compensation is based on the stock price and because top management ultimately is evaluated on it. As an example, firms are thought to manage earnings so that their public image is maintained (e.g., Hall, 1993). Accordingly, managers would be interested in monitoring and forecasting the impact of other aspects of public image, beyond just earnings. Furthermore, stockholders and firms that specialize in financial markets are interested in the impact of public image on stock price. As noted by LaMonica (2005), apparently one of the first firms interested in public image monitoring and forecasting was Morgan Stanley, a financial services firm. If it is possible to monitor a firm's public image and to forecast the impact of changes in that image in real time, we might be able to understand and forecast the impact on stock price. If such information could be gathered, it could be leveraged and potentially result in gains in the stock market.
Image Affects Buying Decisions

As noted by an IBM director of strategy and business development for unstructured information (LaMonica, 2005), "Organizations are struggling to understand what people are saying about them in public. That ends up having an impact on opinion and buying decisions." At the extreme, the wrong public opinion can result in events such as boycotts of firms. For example, in April 2010, there were over 1,600 entries in response to a Google search through blogs for "boycott Walmart." If firms can know in real time what effects negative (or positive) events are having on consumer buying decisions, they may be in a position to gauge the severity of those events and to mitigate or leverage their impact.
Public Image Influences Brand Reputation

As noted by an IBM vice president (IBM, 2005), "Companies are seeking new ways to better understand how they are viewed by customers, investors and other stakeholders who have an impact on their brand reputation … (public image monitoring and forecasting) can help clients track and analyze the pulse of the public in real-time, allowing organizations to be more responsive and deliver better service to their customers."
However, as noted by MacDonald and Sharp (2003), brand monitoring is expensive and is often ignored by management despite its importance. As a result, building a system to facilitate monitoring and forecasting of image can perform an important role by keeping brand image in front of management in real time.
Impact of Law Suits on Public Image

Information about actual and potential law suits facing a company can be apparent from both internal and external sources. Internally, documents, emails, blogs, etc. could be examined to determine whether there is any behavior or information that may lead to a law suit. For example, analysis of internal information at Goldman Sachs may have been helpful in mitigating some of their recent problems (e.g., Popper, 2010). Externally, digital scuttlebutt can lead to an understanding of potential litigation. Reportedly (Clarke, 2005), Wal-Mart has established a "war room" to keep track of digital conversations about it. In particular, Wal-Mart apparently has become the subject of a number of class action law suits over alleged discrimination and other concerns. For example, as noted in Clarke (2005) and Anonymous (2009), Wal-Mart has been criticized over a claimed policy of hiring healthy staff while providing workers inadequate coverage, has been charged with inhibiting staff unionization, and has been accused of negatively influencing local businesses. Since law suits can be costly to defend and have a negative impact on public image, it can be critical to constantly monitor for information about law suits. Accordingly, it is likely that early understanding of the potential for law suits is of interest to management.
Marketing Campaign Success

As noted in IBM (2005), companies are interested in tracking the success of marketing campaigns and new product introductions. Analyzing digital information in blogs and similar sources can help them gather information in real time without using focus groups, by comparing information gathered from blog comments or discussion groups with actual sales data. As a result, firms could compare qualitative and quantitative data to see how well such efforts are actually working, in a real-time environment, without the cost and time delay of forming groups of consumers, putting together questionnaires, and gathering information from them.
WHY MONITOR DIGITAL MEDIA ("SCUTTLEBUTT")?

One of the original news reports (LaMonica, 2005) about IBM's efforts to develop a system to monitor public image indicated that IBM was going to "analyze digital scuttlebutt." What is digital scuttlebutt? Scuttlebutt refers to opinion, rumor, and gossip. As a result, "digital" scuttlebutt refers to digital sources of opinion, rumor, and gossip: for example, blogs, discussion groups, product evaluations, etc. So using the term "digital scuttlebutt" suggests that monitoring and forecasting public image can employ a number of different types of digital media, analyzing those driven by individuals' creation of content. In the early days of the web, most of the web's information was being generated by universities, governments, and businesses, because they were the only ones with access to servers, application software, etc. However, now more than half of the information content on the web comes directly from individuals (Haley, 2005). As a result, people can now rapidly share their opinions with others. Thus, companies need to be able to understand what key opinions are being given, and they need to interpret them in real time. Blogs are one of the emerging types of media that individuals can use to communicate with others and express their opinions, and public image is driven by opinion. There are a number of blog search engines, including "Blog Search Engine" (http://www.blogsearchengine.com/), Google Blog Search (http://blogsearch.google.com/), and Technorati (http://technorati.com/), indicating the increasing importance of blogs and the ability to gather information from them. As noted in Clarke (2005), an IBM executive commented, "People are more likely to spout an opinion on a blog than call a company and complain. Organizations are starting to learn about what potential issues consumers are having with their companies and services. That market is difficult for companies to actively monitor." However, attention must be paid to more than gathering opinion and comments from just blogs. In addition, there are thousands of discussion groups on hundreds of different topics. Google Groups (http://groups.google.com/) provides the ability to search these many different discussions. Furthermore, there are thousands of individual product reviews on sites like Amazon, and other forms of digital information provide a means for people to "shout to the world" about some issues. Finally, critical information can appear in formal news articles. As a result, a wide range of data sources must be investigated to get the necessary opinion information.
WHAT SOURCES OF INFORMATION NEED TO BE MONITORED?

Apparently (Anonymous, 2005), IBM's system that was ultimately designed to troll through publicly available blogs, discussions, etc., first started as a tool designed to analyze corporate data from the perspective that internal documents and materials needed to be searched and analyzed. However, they found that companies were also interested in their public image as seen from external sources. Accordingly, one approach to scoping the information that needs to be examined is to break it into external and internal sources.
External

Substantial quantities of information about firms are captured on the Internet in blogs and other sources relating to a number of different entities. For example, a Google Blog search in April 2010 found 43,893,109 references to Microsoft and 8,532,282 references to IBM. Furthermore, a Google Group search at the same time led to 121,000,000 group references to Microsoft and 23,900,000 references to IBM. Although it is unlikely that all those blog and group references provide important information, it would still appear that there is substantial potential information available in blogs and discussion groups. In addition, the fact that there are multiple millions of entries indicates that it would be impossible for a person to review each item and that content analysis is only feasible if a digital system were in place to investigate the information.
Internal

Although most of our discussion has been on external sources of information, it is likely that information generated internally can be reflective of potential issues that can influence a firm's public image. For example, perhaps such a system may have been helpful to Goldman, Sachs & Co. in connection with the recent controversy in the news (e.g., Popper, 2010). Such a system could peruse emails, internal blogs, and a range of company documents. Furthermore, such an investigation would be particularly important since many of the forms of information generated in external settings are
beginning to be generated in internal settings as firms adopt the use of blogs and discussion groups for internal purposes. Much enterprise information is in general text format; as noted by Mills (2005), roughly 85% of an enterprise's data is unstructured. Accordingly, it is probably not surprising that employees spend roughly 30% of their time looking for information. Again, it is not likely to be feasible that a person could examine all the available text to try to find the information necessary for monitoring and forecasting public image.
HOW CAN FIRMS USE BLOGS AND OTHER DIGITAL MEDIA TO MONITOR AND FORECAST PUBLIC IMAGE?

This section investigates some system characteristics that could allow firms to begin to use blogs, discussion groups, newspaper articles, and other types of digital media to monitor and forecast public image.

Multiple Types of Unstructured Data

To have a system that allows analysis of multiple kinds of unstructured data requires the ability to process multiple formats of text and to integrate across applications. As a result, the original design for IBM's public image monitoring system employed "unstructured information management applications" (UIMA), software designed to analyze unstructured data and discover knowledge from it. UIMA employs Extensible Mark-up Language (XML) with self-describing meta data (e.g., Gotz & Suhre, 2004; http://uima.apache.org/).

Search: Key Word Instances vs. Identifying Actual Data

A key aspect of any system designed to find information bearing on a firm's public image is the ability to search unstructured information. A standard keyword search is a classic approach used to find and return specific instances of a word on a digital page. However, the approach used by IBM and others in their work on monitoring public image has been to capture the actual data associated with the specific key words. For example, in the case of the word "phone," not only the word, but also the associated phone numbers would be captured and
returned for entities. Hampp and Lang (2005) discuss the basic platform used by IBM in semantic search. They also suggest that search can ultimately be tailored to specific industries such as manufacturing, services or government, to fully leverage the ability to spot particular events, agents, times, and locations. Accordingly, in these settings, domain-specific knowledge can be leveraged to facilitate disambiguation of terms and better understand context.
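To illustrate the difference between returning keyword hits and capturing the data associated with a keyword, the toy snippet below pulls phone numbers that appear near the word "phone." The regular expression and the sample text are illustrative assumptions and are not part of IBM's platform.

```python
import re

text = ("Call our phone at 555-867-5309 for support. "
        "The word phone also appears here without a number.")

# Plain keyword search: returns only the positions where the keyword occurs.
keyword_hits = [m.start() for m in re.finditer(r"\bphone\b", text)]

# Value extraction: return the actual phone numbers that follow the keyword.
phone_numbers = re.findall(r"\bphone\b\D{0,15}(\d{3}-\d{3}-\d{4})", text)

print(keyword_hits)      # character offsets of each keyword occurrence
print(phone_numbers)     # ['555-867-5309']
```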
Capturing Sentiment in Digital Media

Although search is critical, once information has been found, there is a need to determine the sentiment or mood associated with key concepts or entities: are these positive or negative comments about a firm's public image? There are a number of approaches aimed at trying to gauge the sentiment in digital media. We examine two that have appeared in the literature, based on different data sets. One approach to monitoring public image is to attempt to gauge the sentiment of a digital disclosure by analyzing the words that are used in the particular discussion (e.g., Attardi & Simi, 2006). Using this approach, some words are categorized as "positive" terms, whereas others are considered to express "negative" terms. Once words are tagged as positive or negative, a proximity search can be done to determine how close those words are to the particular concept of concern. Yang, Si, and Callan (2006) provide some examples of positive verbs (love, like), negative verbs (hate, dislike), and negative adjectives (bad, awful, suck). Unfortunately, this approach can be misleading and does not always provide the desired results. For example, "love" is a positive sentiment term in general. However, in an April 2010 search, I found the expression "I love Microsoft bashing," which is not positive. Such apparently contradictory instances might be characterized as having two (or more) contradictory sentiment words in proximity to each other. Continuing with the example, there is a positive word ("love") and a negative word ("bashing") within one word of the entity "Microsoft." The contradictory words do not cancel each other out, but instead result in a specific sentiment that in this case is negative. Another approach is to use tags provided by users or readers of the materials to determine sentiment. Tag information can also be categorized as positive or negative using the same approach as directly examining content. For example, using "Delicious," users apply tags to external digital
information that can then be categorized. This approach has also found use within firms, where tags are generated for internal text materials (e.g., Blumenstein & Burgelman, 2007). Unfortunately, this approach depends on humans to provide the tags and assumes that the tags are applied correctly and consistently and that users have supplied all the appropriate labels. Although people typically provide important information in their tagging, instances of vandalism and misinterpretation can occur. In some cases, for example, tagging has resulted in spiders being tagged as ''insects'' instead of ''arachnids.''
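As a rough illustration of the lexicon-and-proximity idea described above, the sketch below scores sentiment words found near an entity. The word lists, window size, and the rule that a nearby negative term dominates (as in the ''I love Microsoft bashing'' example) are assumptions for illustration, not the method of any particular system.

```python
# Minimal lexicon-and-proximity sentiment sketch (illustrative word lists and window size).
POSITIVE = {"love", "like", "great"}
NEGATIVE = {"hate", "dislike", "bad", "awful", "suck", "bash", "bashing"}

def entity_sentiment(text, entity, window=3):
    """Score sentiment words found within `window` tokens of the entity.

    Returns 'negative' if any negative term appears in the window (contradictory
    terms do not cancel out), 'positive' if only positive terms appear, else 'neutral'.
    """
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    hits = [i for i, t in enumerate(tokens) if t == entity.lower()]
    found_pos, found_neg = False, False
    for i in hits:
        for t in tokens[max(0, i - window): i + window + 1]:
            found_pos |= t in POSITIVE
            found_neg |= t in NEGATIVE
    if found_neg:
        return "negative"
    if found_pos:
        return "positive"
    return "neutral"

print(entity_sentiment("I love Microsoft bashing", "Microsoft"))        # -> negative
print(entity_sentiment("I love my new Microsoft keyboard", "Microsoft"))  # -> positive
```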
Measuring Change in Sentiment of a Company's Public Image

Once the sentiment data (e.g., counts of positive and negative words in content or tags) have been captured, there are a number of ways to measure changes in sentiment, also referred to as ''mood'' in the literature. For example, Mishne and de Rijke (2006) developed three applications to analyze mood in blogs, based on tags provided by users: Moodgrapher, which tracks global mood levels; Moodteller, which predicts them; and Moodsignals, which tries to uncover the underlying reasons for mood changes. The same approaches can be used with information derived from the digital scuttlebutt. By capturing the number of documents (e.g., blogs or discussions) that carry a particular sentiment or mood, we can construct a time series that tracks those instances over time. Typically, this would mean having both a positive and a negative measure associated with each day over some time period. Traditional time series forecasting methods can then be used to analyze and forecast from those data. Changes in sentiment or mood are reflected in peaks or valleys in the number of documents classified as positive or negative. For periods when positive or negative comments are particularly high or low, direct inspection of the text or additional text mining can identify frequently occurring words or phrases associated with the digital sources. By focusing on the items showing the greatest change, the potential rationales for that change can become apparent.
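The following sketch illustrates one way such a series might be built: it counts positive and negative documents per day and flags days whose negative count jumps well above a trailing average. The sample data, window length, and two-standard-deviation threshold are assumptions, not parameters from the MoodViews tools or IBM's system.

```python
import pandas as pd

# Illustrative input: one row per document with its date and classified sentiment.
docs = pd.DataFrame({
    "date": pd.to_datetime(["2010-04-01", "2010-04-01", "2010-04-02", "2010-04-02",
                            "2010-04-02", "2010-04-03", "2010-04-03", "2010-04-04"]),
    "sentiment": ["negative", "positive", "negative", "negative",
                  "negative", "positive", "negative", "positive"],
})

# Daily counts of positive and negative documents (the two series discussed above).
daily = docs.groupby(["date", "sentiment"]).size().unstack(fill_value=0)

# Flag days where the negative count jumps well above its trailing mean
# (2 standard deviations over a 3-day window -- illustrative thresholds).
roll = daily["negative"].rolling(3, min_periods=1)
daily["neg_spike"] = daily["negative"] > roll.mean().shift(1) + 2 * roll.std().shift(1)

print(daily)
```

Flagged days would then be the natural candidates for the direct inspection or additional text mining described above.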
EXPECTATIONS FROM A PUBLIC IMAGE MONITORING AND FORECASTING SYSTEM

What should we expect from such a system? As noted by an IBM executive (Clarke, 2005), ''(The results are) not going to be 100 per cent 'noise' free. You make tradeoffs: do you want to get everything that's relevant or do you want to miss some things? The feedback I'm hearing from companies is, if you can get them 50 per cent or 80 per cent there that makes a huge impact.'' Instead, he observed, Public Image Monitoring should be used as a supplement to actually reading blogs. Unfortunately, as we noted earlier, there may be literally millions of blogs, discussion groups, or product evaluations, and firms are generally trying to act in real time. As a result, it could be difficult to have a human read all the appropriate material. Accordingly, it becomes important to isolate the blogs or discussion groups generating the most negative or positive activity, and to identify those that are more optimistic than others. Classic statistics, such as the number of visitors to a site, could provide some insight into such characterizations.
SUMMARY AND EXTENSIONS

The purpose of this section is to provide a brief summary of the chapter and to discuss some extensions.

Summary

This chapter has investigated the notion of public image monitoring and forecasting based on a range of digitally available information. Such monitoring can be critical to firms for a number of reasons, ranging from its impact on stock price to its influence on buying behavior to the tracking of potential legal ramifications. Digitally available information such as blogs is useful for this purpose in part because it is increasingly being used to express opinions about companies and other issues. This chapter found that counts of positive and negative items (e.g., user-supplied tags or computer-generated classifications based on sentiment-bearing verbs and adjectives) can serve as the basis for monitoring and forecasting the public image of a company.
Extensions

This chapter can be extended in a number of ways. First, the search could move beyond generic public image and focus specifically on either internal or external events. For example, the analysis could examine a particular concern related to public image, such as ''fraud,'' ''merger,'' or ''acquisition.'' Second, alternative approaches for determining positive or negative sentiment, or for resolving apparent conflicts, could be examined. The research discussed in this chapter was based on positive and negative vocabularies of sentiment-bearing content words and tags; approaches based on ''semantic concepts'' or other techniques might be used instead. Third, the nature of the relationship between public image, as measured by sentiment in digital sources, and stock price is unclear. Establishing the existence of such a relationship could prove extremely valuable, and future research could map public image against stock price performance. Fourth, the discussion in this chapter effectively treated occurrences of positive or negative sentiment in blogs, discussion groups, and news articles as equivalent. Since these sources are unlikely to have the same impact, it may be more appropriate to keep each kind of digital information separate and investigate each on its own; further research could examine the extent of impact that each has, using a multivariate analysis. Fifth, some blogs or discussion groups may be more influential than other sources. Although it probably is impossible to know a priori which are the most influential, statistics such as the number of visitors can help guide the user to the more important influences. Sixth, ''public image'' is not a well-defined concept. It likely has many facets, reflected in the rationales for using such a system, and future research might try to tease out individual components related to issues such as legal events, products, and so on.
REFERENCES

Anonymous. (2005). IBM to analyze digital scuttlebutt. Available at http://www.mtc-ic.com/en/NewsShow.asp?Id=294. Retrieved on November 9, 2005.

Anonymous. (2009). Barstow California Wal-Mart settles distribution center lawsuits. Available at http://blog.wakeupwalmart.com/ufcw/court_of_public_opinion/index.html
Attardi, G., & Simi, M. (2006). Blog mining through opinionated words. The 15th Text Retrieval Conference Proceedings. Available at http://trec.nist.gov/pubs/trec15/papers/upisa.blog.final.pdf

Blumenstein, B., & Burgelman, R. (2007). Knowledge management at Katzenbach Partners LLC. Stanford Graduate School of Business, Case SM-162, June 25, 2007.

Clarke, G. (2005). IBM targets brand conscious with search. The Register. Available at http://www.theregister.co.uk/2005/11/09/ibm_blog_brand_software/. Retrieved on November 9, 2005.

Gotz, T., & Suhre, O. (2004). Design and implementation of the UIMA common analysis system. IBM Systems Journal, 43(3), 476–489.

Haley, C. (2005). Blog spotting with IBM. Available at http://www.internetnews.com/xSP/article.php/3562116. Retrieved on November 7, 2005.

Hall, S. C. (1993). Political scrutiny and earnings management in the oil refining industry. Journal of Accounting and Public Policy, 12(4), 325–351.

Hampp, T., & Lang, A. (2005). Semantic search in WebSphere integrator OmniFind edition: The case for semantic search. Available at http://www.ibm.com/developerworks/data/library/techarticle/dm-0508lang/. Retrieved on August 5, 2005.

IBM. (2005). IBM tracks blogs, web content to capture buzz, spot trends around companies, products and marketing campaigns. Available at http://www-03.ibm.com/press/us/en/pressrelease/7961.wss. Retrieved on November 7, 2005.

LaMonica, M. (2005). IBM to analyze digital scuttlebutt. Available at http://news.cnet.com/IBM-to-analyze-digital-scuttlebutt/2100-1012_3-5940339.html. Retrieved on November 8, 2005.

MacDonald, E., & Sharp, D. (2003). Management perceptions of the importance of brand awareness as an indication of advertising effectiveness. Marketing Bulletin, 14. Available at http://marketing-bulletin.massey.ac.nz/V14/MB_V14_A2_Macdonald.pdf

Mills, E. (2005). IBM dives deeper into corporate search. Available at http://news.cnet.com/IBM-dives-deeper-into-corporate-search/2100-7344_3-5820938.html?tag=mncol;txt. Retrieved on August 7, 2005.

Mishne, G., & de Rijke, M. (2006). MoodViews: Tools for blog mood analysis. Menlo Park, CA: American Association for Artificial Intelligence.

Popper, N. (2010). Goldman trader in spotlight. Los Angeles Times, April 27, pp. B1, B5.

Yang, H., Si, L., & Callan, J. (2006). Knowledge transfer and opinion detection in the TREC 2006 blog track. Available at http://trec.nist.gov/pubs/trec15/t15_proceedings.html
EVALUATING SURVIVAL LIKELIHOODS IN PALLIATIVE PATIENTS USING MULTIPLE CRITERIA OF SURVIVAL RATES AND QUALITY OF LIFE

Virginia M. Miori and Daniel J. Miori

ABSTRACT

Palliative care concentrates on reducing the severity of disease symptoms, rather than providing a cure. The goal is to prevent and relieve suffering and to improve the quality of life for people facing serious, complex illness. It is therefore critical in the palliative environment that caregivers are able to make recommendations to patients and families based on reasonable assessments of the amount of suffering and the quality of life. This research uses statistical methods of evaluation and prediction, as well as simulation, to create a multiple criteria model of survival rates, survival likelihoods, and quality of life assessments. The results have been reviewed by caregivers and are seen to provide a solid analytical base for patient recommendations.
Advances in Business and Management Forecasting, Volume 7, 221–238
Copyright © 2010 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISSN: 1477-4070/doi:10.1108/S1477-4070(2010)0000007018
INTRODUCTION

Palliative care is defined as ''an approach that improves the quality of life of patients and their families facing the problem associated with life-threatening illness, through the prevention and relief of suffering by means of early identification and impeccable assessment and treatment of pain and other problems, physical, psychosocial and spiritual.'' The practice of palliative care is growing substantially. Rather than providing a cure, palliative care is intended to reduce the severity of disease symptoms to improve quality of life.

When a palliative patient goes into cardiac arrest, medical practitioners are responsible for determining the duration of resuscitation attempts. To date, intuition has been the most utilized tool for making this determination. The objective of this research was to develop a quantitative method to estimate survival rates and survival likelihoods to assist medical practitioners.

Resuscitation is typically indicated in two primary instances: respiratory arrest and cardiac arrest. Respiratory arrest is generally considered to be the cessation of breathing and typically coincides with cardiac arrest. Cardiac arrest is well defined as the cessation of normal circulation of the blood due to failure of the heart to contract effectively (Jameson et al., 2005). We focus in this chapter on patients who have gone into cardiac arrest. In these cases, there is no guarantee that an attempt at resuscitation will restore complete or even partial function to the patient. The medical response to cardiac arrest relies on a complex combination of medications, treatments, and procedures delivered in an aggressive fashion, known in medical circles as a code. Resuscitation is not attempted for patients who maintain a Do Not Resuscitate (DNR) order.

Survival rates following cardiac arrest are quite low, but as family members, we often cling to the small hope of recovery. Patients who arrest outside of a hospital environment have a worse survival rate than those who arrest in hospitals: approximately 14–22% survive to admission, with only 6–8% surviving to discharge. Those who arrest in hospitals have approximately a 14–28% survival rate to discharge (Eisenberg & Mengert, 2001; Cobbe, Dalziel, Ford, & Marsden, 1996; Ballew, 1997). The nature of the heart rhythm at cardiac arrest also has a significant impact on survival likelihood; for example, patients who experience pulseless electrical activity or asystole have even lower survival rates (Eisenberg & Mengert, 2001). For palliative patients with other life-threatening chronic conditions (comorbidities), the survival rate decreases even further.
We seek to examine the characteristics of palliative care patients who have gone into cardiac arrest and to develop a procedure that effectively predicts the survival rates of these patients. The ability to quantitatively assess survival rates will provide substantial supporting information to medical practitioners in advising patients and their families. The analysis in our chapter begins with an examination of data collected from palliative care patients. We first fit statistical distributions to represent the likelihoods of individual comorbidities. The analysis continues with the development of a logistic regression using the binary characterization of survival as the response variable. The logistic regression and analyses culminate in the presentation of a tool to predict survival rate based on patient age and medical condition.
LITERATURE

The literature encompasses several areas. We first discuss some of the extensive research on communication with families of palliative patients. Although this chapter does not focus specifically on communication, this collected research does emphasize the difficulties faced by doctors, patients, and families alike. The development of a survival forecasting method will significantly impact the nature and quality of these conversations. We also present published research on the nature of cardiac resuscitation efforts and specific research into survival rates. We then examine published literature surrounding palliative care and cardiac arrest in palliative care patients. The final section discusses forecasting techniques used in survival rate and other similar predictions.
Decision-Making and Communication

Charles, Gafni, and Whelan (1999) identify steps in the treatment decision-making process and present three different models of decision making: paternalistic, shared, and informed. They emphasize the importance of flexibility in decision making, which allows for a dynamic view, recognizing that the initial approach to treatment may change as interactions with patients and families evolve. Decision-making conversations generally occur when the goal of treatment should be changed. They provide the assimilation of negative evidence in support of that change, whereas non-decision-making conversations tend
to provide positive information and messages of hope. Although communication is not the focus of this chapter, it is a necessary outgrowth of our research and may be further refined based on the use of this research. Azoulay et al. (2000) focused on these communications as a critical element in quality care. Curtis et al. (2001) also emphasize the importance of communication and provide guidance for family conferences and the role that medical professionals may take in improving the communication that occurs. Whitney (2003) proposed a decision-making structure in which patients were encouraged to be the primary decision maker in major decisions with low certainty, using the physician as a basis. Minor decisions with high certainty should typically be left to the physician, and major decisions with high certainty are seen to be the greatest source of conflict between physicians and patients when disagreement occurs. Physicians' views of patients' participation in decision making were researched by McGuire, McCullough, Weller, and Whitney (2005). The physicians in the study expressed consistently positive attitudes toward patient participation in decision making, citing patient autonomy as an important aspect of care. The decision-making process did, however, vary based on patient, physician, and environmental factors. The nature of discussion among physicians, families, and patients has been examined in great detail, although most studies in this arena have been performed on a retrospective basis. Researchers are endeavoring to extend their work to reflect actual end-of-life discussions. Aldridge and Barton (2007) have considered previous approaches and have attempted to explain the subtle differences that exist between conversations.
Surviving Cardiac Arrest

Substantial research has been done on improving survival after cardiac arrest, although this research has a very limited focus when considering terminal or palliative patients. A sample of recent research in this area is presented here. Bunch, White, Khan, and Packer (2004) examined the effect of age on survival after out-of-hospital cardiac arrest. They found that survival rates were improved by the presence of a rapid defibrillation program. The survival rates were reflective of a nonspecific population and did not specifically consider disease or preexisting conditions. Aufderheide, Pirrallo, Provo, and Lurie (2005) and Aufderheide et al. (2008) studied the physiology of cardiopulmonary resuscitation (CPR) in an
effort to improve survival rates in out-of-hospital cardiac arrest cases. They proposed the use of an impedance threshold device with renewed emphasis on compressions and a reduced number of ventilations, which resulted in higher survival rates. In all of these cases, survival rates were calculated after the fact, providing no real insight into the forecasting of survival rates. The use of therapeutic hypothermia in patients after cardiac arrest was examined by Tiainen et al. (2009). Patients were cooled to 33°C and then allowed to warm over the next 24 hours. These patients did not experience an increase in clinically significant arrhythmias, and their electrocardiograms at 24 and 48 hours may be a predictor of favorable outcomes in cardiac arrest patients treated with hypothermia. White, Blackwell, Russell, and Jorgenson (2004) determined that body weight was not a predictor of survival in patients with out-of-hospital cardiac arrest in cases where a defibrillator was used. These patients were determined to have shockable rhythms and were initially treated by basic life support personnel. Meaney et al. (2010) also explored the nature of rhythms as shockable or not shockable in patients experiencing out-of-hospital cardiac arrest. They determined that shockable rhythms were a significant factor in survival to discharge.
Palliative Patients

Much of the research in the palliative care arena also focuses on communication and ethical care. Garland and Connors (2007) performed an observational study of physicians in an adult medical intensive care unit of a public teaching hospital. They collected data on individual physicians and the influence they had over decisions to forego life support. Few Americans have health care directives, and therefore the decision to forego life support is left to families under the guidance of physicians. The association of the decision made with physician opinion was well established in this work; the degree and implications of the association are not yet clear and require additional research. Discussions about resuscitation and goals of care in palliative patients are not currently subject to any guidelines. Downar and Hawryluck (2010) proposed using the Delphi method to develop a series of nine consensus statements determining the content of an end-of-life discussion. The statements focus on life-sustaining therapy within the concept of palliative care. The method also provides for changes in the conversation
after cardiac arrest. The statements can serve as guidelines for physicians to facilitate effective, informed, and ethically sound decision making.
Forecasting Survival Rates

In medical studies, survival rates are typically examined in terms of their complement, mortality rates. A number of different forecasting methods have been explored for predicting survival rates in various settings. The approaches range from very straightforward use of statistical tools to nonparametric methods. Zaroff, diTommaso, and Barron (2002) used an approach similar to that presented in this chapter. They used logistic regression to quantify the effects of the independent variables on coronary artery bypass grafting (CABG) patients, examining the effects of recent myocardial infarction, age, gender, and other clinical factors. A risk score was generated, and the findings indicated that age, previous CABG, heart failure on presentation, and female gender were all predictors of heightened risk. Log-linear Poisson models were used by Pérez-Fariños, López-Abente, and Pastor-Barriuso (2006) to predict variations in mortality rates among kidney cancer patients across Europe. Hariharan, Saied, and Kocher (2008a, 2008b) analyzed mortality rates for pancreatic and gallbladder cancer patients. In both cases, the authors used log-linear regression analysis to examine a decade of worldwide data from the International Agency for Research on Cancer. Their research was also significant in that it assessed mortality from these types of cancer in 50 countries. Time-dependent point correlation dimension, an automated nonlinear approach, was used by Skinner, Anchine, and Weiss (2008) to analyze the heartbeats of cardiac patients and assess their risk of arrhythmic death. This work was later continued when Skinner et al. (2009) compared linear stochastic and nonlinear deterministic forecasting models on their ability to predict risk of arrhythmic death in cardiac patients. The linear stochastic approaches included the mean and standard deviation of normal heartbeats and the power spectrum normalized by log transformation, whereas the nonlinear methods included point correlation dimension, detrended fluctuation analysis, and approximate entropy. Erbas et al. (2010) examined breast cancer mortality in the United States, England, and Wales. The researchers developed mortality age curves based on plots of age and mortality associations. After taking the log of the mortality rates (their outcome), functional data analysis techniques
were used to model the log curves. These included nonparametric methods such as penalized regression splines and loess curves. The functional data analysis methods were selected specifically because of their strength in providing information about curves or other surfaces that vary over a continuum. The longevity of Welsh and English males was studied by Down, Blake, and Cairns (2010). Fan charts based on a mortality model calibrated to mortality data were used; these charts show a central projection based on typical measures of central tendency. A two-factor mortality model was used in the ultimate creation of the charts, in both deterministic and stochastic frameworks. Creative uses of forecasting models have been employed extensively in the examination of mortality rates. These approaches all represent the desire to more adequately anticipate the condition of patients. The extended discussion of communication and decision making is a necessary follow-on to all of the mortality forecasting already completed. We move forward in this chapter to discuss the available data and the subsequent analysis methods used to predict the likelihood of survival in palliative patients who have gone into cardiac arrest. Rather than discussing increased risk, this discussion is couched in terms of survival likelihoods. The chapter concludes with simulation methods designed to assist in physician assessments of patients and the ensuing care decisions.
DATA ANALYSIS

Procedure

The historical data included five survival metrics, three of which are examined in this analysis. The metrics are ''survived code,'' ''survived 24 hours,'' and ''survived to discharge''; each is a binary indication of survival or lack of survival. The remaining two metrics include further information on quality of life when patient survival reaches 6 months and 1 year. Not only were these incidences of survival few in number, but the ability to obtain reliable follow-up data from patients was severely compromised, making the data unreliable. The age of the patient, the duration of the resuscitation effort, and the length of stay in the hospital are the quantitative variables present, whereas the remaining variables of interest are binary. These variables reflect the existence of certain conditions in patients, known as comorbidities. They include
hypertension, diabetes mellitus, congestive heart failure, myocardial infarction (heart attack), history of myocardial infarction, cerebral vascular accident (stroke), history of cerebral vascular accident, on vent, on pressors, pneumonia, sepsis, coronary artery disease, acute renal failure, chronic renal failure, chronic obstructive pulmonary disease, and atrial fibrillation. The initial set of data contained 42 records gathered from a specific geographic area. Although the sample size is not sufficient to verify the results on a large-scale basis, it is sufficient to verify the techniques being presented. Additional records are being gathered and will be used to extend this research and verify standard survival rate results. On the basis of the initial set of data, logistic regression (which is characterized by a binary dependent variable) was selected as an appropriate analysis tool. A series of regressions was run using surviving the code, surviving 24 hours, and surviving until discharge as the dependent variables. The independent variables included patient age, duration of the resuscitation effort, and length of hospital stay, as well as the comorbidities stated above. The researchers designated particular comorbidity interactions that have been anecdotally observed as significant negative indicators of survival; these interactions are known to the authors as ''double whopper with cheese'' (DWWC). The logistic regression results include the variable significance levels, the variable coefficients, the percentage change in survival likelihood associated with changes in the independent variables, and the ability to calculate an actual survival likelihood. We may then simulate individual survival based on patient characteristics.
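A minimal sketch of the kind of logistic regression described above is shown below, using Python's statsmodels rather than the authors' software; the synthetic records, variable names, and data-generating coefficients are illustrative assumptions and do not reproduce the study's 42-record data set.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulate an illustrative data set; the real study used 42 patient records
# and many more comorbidities than are shown here.
rng = np.random.default_rng(0)
n = 60
age = rng.integers(55, 90, n)
code_minutes = rng.integers(5, 45, n)
chf = rng.integers(0, 2, n)    # congestive heart failure indicator
cvhx = rng.integers(0, 2, n)   # history of cerebral vascular accident

# Hypothetical data-generating relationship, used only to create outcomes to fit.
logit = 0.04 * (age - 70) - 0.08 * (code_minutes - 20) - 1.0 * chf - 0.8 * cvhx
survived_code = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X = sm.add_constant(pd.DataFrame(
    {"age": age, "code_minutes": code_minutes, "chf": chf, "cvhx": cvhx}))
model = sm.Logit(survived_code, X).fit(disp=0)

# Odds ratios: multiplicative change in the odds of survival per unit change in each variable.
print(np.exp(model.params))

# Predicted survival likelihood for a hypothetical 75-year-old with CHF and a 20-minute code.
patient = pd.DataFrame([[1.0, 75, 20, 1, 0]], columns=X.columns)
print(model.predict(patient)[0])
```

The fitted probability for an individual patient is what the chapter refers to as the survival likelihood, and it is this quantity that can later be fed into a simulation.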
Fitting Statistical Distributions

We begin by focusing on surviving the cardiac resuscitation. A parallel analysis was completed for the two additional survival variables: survive 24 hours and survive to discharge. We sought again to validate the procedure in anticipation of a larger data set. The duration of the resuscitation effort and the age of the patient have always been viewed as strong indicators of patient condition and patient survival likelihood. Inevitably, the question of how long a code lasted has been a point of discussion among medical practitioners and families alike. In reality, the length of a code most likely has causal links to the physician's years of experience, specialty, and demonstrated history with patients in need of cardiac resuscitation. The difficulty in gathering data of this nature
leaves much of this effect to the error term. Therefore, we focus on the specific relationships between survival, the age of the patient, and the duration of the resuscitation effort. Fig. 1 presents a histogram of these durations. Fig. 2 presents the duration of the resuscitation effort by age of patient.

Fig. 1. Histogram of Duration of Cardiac Resuscitation Effort.

Fig. 2. Duration of Cardiac Resuscitation Effort by Age of Patient.

Crystal Ball was used to fit distributions to the quantitative variables in the data. These distributions were then carried forward into the survival simulation. Fig. 3 shows the fitted distribution of the duration of the cardiac resuscitation effort; the best fit, according to a chi-square goodness-of-fit test, was a logistic distribution. As shown in Fig. 4, the Student's t-distribution provided a stronger fit for code durations of less than 50 minutes. Ultimately, the question of fit will be better resolved with the extended data set.
Fig. 3. Duration of Cardiac Resuscitation Fitted to a Logistic Distribution.

Fig. 4. Duration of Cardiac Resuscitation Fitted to a Student's t-Distribution.
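The sketch below illustrates the same fitting exercise with open-source tools standing in for Crystal Ball: candidate logistic and Student's t distributions are fit to synthetic code durations by maximum likelihood and compared with a Kolmogorov–Smirnov statistic (the chapter's analysis used a chi-square goodness-of-fit test). The durations are simulated and purely illustrative.

```python
import numpy as np
from scipy import stats

# Synthetic code durations (minutes) standing in for the 42 observed values.
rng = np.random.default_rng(1)
durations = rng.gamma(shape=3.0, scale=8.0, size=42)

# Fit candidate distributions by maximum likelihood.
logistic_params = stats.logistic.fit(durations)   # (loc, scale)
t_params = stats.t.fit(durations)                 # (df, loc, scale)

# Compare fits with a Kolmogorov-Smirnov statistic (lower is better).
ks_logistic = stats.kstest(durations, "logistic", args=logistic_params)
ks_t = stats.kstest(durations, "t", args=t_params)
print("logistic:", logistic_params, ks_logistic.statistic)
print("student t:", t_params, ks_t.statistic)
```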
Fig. 5. Comorbidity Frequency.
Fig. 5 provides the frequencies of comorbidities in all patients who underwent cardiac resuscitation, stratified by survival. By examining these frequencies, we were able to begin formalizing the logistic regression for our small sample size. As more data become available, our ability to find high significance levels will increase. The high frequency of hypertension indicated a significant risk factor but did not necessarily indicate a causal link, particularly because of the high incidence of hypertension in both surviving and nonsurviving patients. The low frequencies of pneumonia and of patients on pressors were insufficient for these variables to act as significant predictors. The comorbidities that exhibited a higher incidence in nonsurviving patients were expected to show significance in the logistic regression.
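A brief sketch of the kind of stratified frequency tabulation behind Fig. 5 is shown below; the tiny illustrative records and comorbidity columns are assumptions, not the study data.

```python
import pandas as pd

# Illustrative patient records; the real data set has 42 records and many more comorbidities.
patients = pd.DataFrame({
    "survived_code": [1, 0, 0, 1, 0, 1, 0, 0],
    "hypertension":  [1, 1, 1, 1, 1, 0, 1, 1],
    "chf":           [0, 1, 1, 0, 1, 0, 1, 0],
    "pneumonia":     [0, 0, 1, 0, 0, 0, 0, 0],
})

comorbidities = ["hypertension", "chf", "pneumonia"]

# Counts of each comorbidity, stratified by survival (cf. Fig. 5).
print(patients.groupby("survived_code")[comorbidities].sum())

# Relative incidence highlights comorbidities common in both groups (hypertension)
# versus those concentrated among non-survivors (chf).
print(patients.groupby("survived_code")[comorbidities].mean())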
Logistic Regression

The initial logistic regression runs used ''survive code,'' ''survive 24 hours,'' and ''survive to discharge'' as dependent variables. Each variable takes on a value of zero (0) to denote no survival and a value of one (1) to denote survival. The independent variables were age, length of code, length of stay in the hospital, and the individual comorbidities. Subsequent runs included these variables and single interactions of comorbidities.
Logistic regression not only produces a binary outcome, suggesting survival or lack of survival, but also provides an odds ratio for each independent variable. The odds ratio indicates the contribution of that variable to the overall likelihood of a positive, or survival, outcome; it reflects the impact of a change in only that independent variable. The accumulation of odds ratios therefore ultimately enables the calculation of the overall probability of an outcome, either positive or negative. This probability is in fact the survival likelihood. The odds ratios for these variables provide important insights into the impact of the variables. The odds ratio for age actually indicates that the likelihood of surviving cardiac resuscitation increases with advanced years. This anomaly has been observed anecdotally, with the results of the regression providing the first indication of a demonstrable link. One reason for this anomaly has been postulated: palliative patients who have reached a more advanced age have lived a greater number of healthy years, and thus their bodies may in fact be stronger than those of patients who have reached a palliative stage at an earlier age. The odds ratios indicate that the incidence of congestive heart failure and a history of cerebral vascular accident decreased the patient's chance of survival. The odds ratio on length of stay indicated that longer hospital stays would increase survival probabilities. This relationship may, however, represent a reversal of cause and effect: lengthier hospital stays may in fact be associated with patients who survive cardiac resuscitation. The logistic regressions presented in this chapter are the product of a series of regressions. Due to the limitations of our initial data set, the categorical (binary) variables representing the incidences of pneumonia and patients ''on pressors'' were eliminated from the analysis. The regression and goodness of fit for ''survive code'' are provided in Fig. 6. The variables with the highest level of significance are age, dm (diabetes mellitus), chf (congestive heart failure), cv (cerebral vascular accident), cvhx (history of cerebral vascular accident), and stay. The statistical fit was quite strong, based on a chi-square test that showed 99% confidence in the fit. Fig. 7 provides the logistic regression outcome for ''survive 24 hours.'' This fit was also quite strong, showing a chi-square with 98% confidence in the fit. The logistic regression performed on the dependent variable ''survive to discharge'' is presented in Fig. 8. The chi-square test showed quite a weak fit, but this was not surprising given the very limited number of patients who survived until discharge. This regression will improve with additional data, although the number of palliative patients who have undergone cardiac
resuscitation and survived to discharge may never be sufficient to achieve a strong fit.

Fig. 6. Logistic Regression Output for ''Survive Code.''

Fig. 7. Logistic Regression Output for ''Survive 24 Hours.''

Fig. 8. Logistic Regression Output for ''Survive to Discharge.''

The inclusion of single-interaction terms in these regressions provided very little additional information. We attribute this to two causes. First, the existence of one comorbidity carried substantial influence over the likelihood of survival, diminishing the impact of the second comorbidity. Second, the limited size of the data set again comes into play. We may see improvements in the significance of the interaction terms in the future.

Impact of Small Data Set

We have discussed throughout the chapter the impact of the small data set. We have been able to generate valuable regression results and anticipate stronger relationships between variables developing with a larger set of data. The small data set, however, left us subject to a number of complications. Throughout the logistic regression analyses on multiple dependent variables, a number of variables were dropped due to collinearity. This condition occurred primarily with interaction terms, and some of the collinearities were purely coincidental.
Variables were also eliminated from consideration because they predicted the outcome perfectly. Again, much of this complicating factor was a result of coincidence, which plays a very strong role in small data sets, particularly when a binary variable has a limited number of positive outcomes. A single individual who happened to have pneumonia did not survive resuscitation, resulting in an absolute prediction of the survival outcome based on the existence of pneumonia. This simple yet easily misinterpreted relationship will change significantly with the larger data set to be examined in the next phase of the research.
SIMULATION DESIGN

The results of the logistic regression, along with the initial fitted distributions, combine to provide a strong start in developing simulations to predict survival probabilities. Two approaches to this simulation were taken. The first was the straightforward calculation of survival probability using the outcomes of the logistic regressions. The calculations were built into a VBA tool that allowed users to enter all independent variable values associated with a particular patient into a small application screen; the resulting survival likelihood was immediately provided. This tool was made available to medical professionals within and outside of palliative medicine for testing and validation. Upon examination, the tool was considered to be viable. As with all things dealing with the loss of human life, it will certainly never replace the considered judgment of a medical staff. It will, however, provide a solid basis for assisting in decision-making conversations with families of palliative patients. The second approach was the use of Monte Carlo simulation using Crystal Ball. The fitted distributions associated with the occurrence of the comorbidities, age, duration of the cardiac resuscitation, and length of hospital stay were established as stochastic predictors in the simulation model. The value of extending the simulation to Crystal Ball was the ability to examine the range of possible outcomes and develop a broader understanding of survival probabilities. Clearly, this is a case in which we are uncomfortable stating these probabilities with utter certainty; the human body is capable of many surprises, and the Monte Carlo simulations allow us to capture these cases. The Monte Carlo simulation is, however, limited at this point. The fitted statistical distributions demonstrated lackluster fit due to having only 42 data points, and the large variations had a significant impact on the
simulation outcomes. The results were not as robust as the logistic regression results, but over time, the inclusion of additional data will contribute to a stronger tool.
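As a rough sketch of the Monte Carlo idea, the code below draws patient characteristics from assumed distributions and pushes them through a hypothetical set of logistic-regression coefficients to produce a distribution of survival likelihoods. It stands in for the Crystal Ball model, and none of the numbers come from the study.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials = 10_000

# Hypothetical coefficients (intercept, age, code duration, CHF, CVA history);
# in practice these would come from the fitted "survive code" regression.
beta = np.array([-1.5, 0.04, -0.08, -1.0, -0.8])

# Stochastic inputs drawn from assumed fitted distributions (choices are illustrative).
age = rng.normal(loc=74, scale=9, size=n_trials)
duration = np.clip(rng.logistic(loc=22, scale=7, size=n_trials), 1, None)
chf = rng.binomial(1, 0.45, size=n_trials)
cvhx = rng.binomial(1, 0.30, size=n_trials)

X = np.column_stack([np.ones(n_trials), age, duration, chf, cvhx])
survival_prob = 1.0 / (1.0 + np.exp(-X @ beta))

# A distribution of simulated survival likelihoods, rather than a single point estimate.
print("mean:", survival_prob.mean())
print("5th-95th percentile:", np.percentile(survival_prob, [5, 95]))
```

Reporting an interval of simulated likelihoods, rather than one number, reflects the chapter's point that these probabilities should not be stated with utter certainty.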
CONCLUSION

The evaluation of patient survival in a palliative care setting is of the utmost importance to medical practitioners, yet there is a lack of significant research supporting the intuition of these practitioners. Statistical research in this area not only provides additional information to those in the field of medicine but also provides valuable information to the families and loved ones of those in palliative care. This chapter represents a beginning, and through the addition of larger data sets and extended examination, the outcomes will provide stable and solid results. The purpose of this chapter was validation of the approach to examination of the data. The results are very promising and indicate that additional value could well be achieved with more data.
REFERENCES

Aldridge, M., & Barton, E. (2007). Establishing terminal status in end-of-life discussions. Qualitative Health Research, 17(7), 908–918.

Aufderheide, T. P., Alexander, C., Lick, C., Myers, B., Romig, L., Vartanian, L., Stothert, J., McKnite, S., Matsuura, T., Yannopoulos, D., & Lurie, K. (2008). From laboratory science to six emergency medical services systems: New understanding of the physiology of cardiopulmonary resuscitation increases survival rates after cardiac arrest. Critical Care Medicine, 36(11), S397–S404.

Aufderheide, T. P., Pirrallo, R. G., Provo, T. A., & Lurie, K. G. (2005). Clinical evaluation of an inspiratory impedance threshold device during standard cardiopulmonary resuscitation in patients with out-of-hospital cardiac arrest. Critical Care Medicine, 33(4), 734–740.

Azoulay, E., Chevret, S., Leleu, G., Pochard, F., Barboteu, M., Adrie, C., Canoui, P., Le Gall, J. R., & Schlemmer, B. (2000). Half the families of intensive care unit patients experience inadequate communication with physicians. Critical Care Medicine, 28(8), 3044–3049.

Ballew, K. A. (1997). Cardiopulmonary resuscitation. British Medical Journal, 314(7092), 1462–1465.

Bunch, T. J., White, R. D., Khan, A. H., & Packer, D. L. (2004). Impact of age on long-term survival and quality of life following out-of-hospital cardiac arrest. Critical Care Medicine, 32(4), 963–967.
Charles, C., Gafni, A., & Whelan, T. (1999). Decision-making in the physician-patient encounter: Revisiting the shared treatment decision-making model. Social Science & Medicine, 49, 651–661.

Cobbe, S. M., Dalziel, K., Ford, I., & Marsden, A. K. (1996). Survival of 1476 patients initially resuscitated from out of hospital cardiac arrest. British Medical Journal, 312(7047), 1633–1637.

Curtis, J. R., Patrick, D. L., Shannon, S. E., Treece, P. D., Engelberg, R. A., & Rubenfeld, G. D. (2001). The family conference as a focus to improve communication about end-of-life care in the intensive care unit: Opportunities for improvement. Critical Care Medicine, 29(2), N26–N33.

Down, K., Blake, D., & Cairns, A. J. G. (2010). Facing up to uncertain life expectancy: The longevity charts. Social Science Research Network: Demography (Working Paper), 47(1). The Pensions Institute, London, UK.

Downar, J., & Hawryluck, M. D. (2010). What should we say when discussing ''code status'' and life support with a patient? A Delphi analysis. Journal of Palliative Medicine, 13(2), 185–195.

Eisenberg, M. S., & Mengert, T. J. (2001). Cardiac resuscitation. New England Journal of Medicine, 344(17), 1304–1313.

Erbas, B., Akram, M., Gertig, D. M., English, D., Hopper, J. L., Kavanagh, A. M., & Hyndman, R. (2010). Using functional data analysis models to estimate future time trends in age-specific breast cancer mortality for the United States and England–Wales. Journal of Epidemiology, 20(2), 159–165.

Garland, A., & Connors, A. F., Jr. (2007). Physicians' influence over decisions to forego life support. Journal of Palliative Medicine, 10(6), 1298–1305.

Hariharan, D., Saied, A., & Kocher, H. M. (2008a). Analysis of mortality rates for gallbladder cancer across the world. HPB: The Official Journal of the International Hepato Pancreato Biliary Association, 10, 327–331.

Hariharan, D., Saied, A., & Kocher, H. M. (2008b). Analysis of mortality rates for pancreatic cancer across the world. HPB: The Official Journal of the International Hepato Pancreato Biliary Association, 10, 58–62.

Jameson, J. N. St. C., Kasper, D. L., Harrison, T. R., Braunwald, E., Fauci, A. S., Hauser, S. L., & Longo, D. L. (2005). Harrison's principles of internal medicine. New York: McGraw-Hill Medical Publishing Division.

McGuire, A. L., McCullough, L. B., Weller, S. C., & Whitney, S. N. (2005). Missed expectations? Physicians' views of patients' participation in medical decision-making. Medical Care, 43(5), 466–470.

Meaney, P. A., Nadkarni, V. M., Kern, K. B., Indik, J. H., Halperin, H. R., & Berg, R. A. (2010). Rhythms and outcomes of adult in-hospital cardiac arrest. Critical Care Medicine, 38(1), 101–108.

Pérez-Fariños, N., López-Abente, G., & Pastor-Barriuso, G. (2006). Open access time trend and age-period-cohort effect on kidney cancer mortality in Europe. BMC Public Health, 6, 119–126.

Skinner, J. E., Anchine, J. M., & Weiss, D. N. (2008). Nonlinear analysis of the heartbeats in public patient ECGs using an automated PD2i algorithm for risk stratification of arrhythmic death. Therapeutics and Clinical Risk Management, 4(2), 549–557.

Skinner, J. E., Meyer, M., Nester, B. A., Geary, U., Taggart, P., Mangione, A., Ramalanjaona, G., Terregino, C., & Dalsey, W. C. (2009). Comparison of linear–stochastic and
nonlinear–deterministic algorithms in the analysis of 15-minute clinical ECGs to predict risk of arrhythmic death. Therapeutics and Clinical Risk Management, 5, 671–682.

Tiainen, M., Parikka, H. J., Mäkijärvi, M. A., Takkunen, O. S., Sarna, S. J., & Roine, R. O. (2009). Arrhythmias and heart rate variability during and after therapeutic hypothermia for cardiac arrest. Critical Care Medicine, 37(2), 403–409.

White, R. D., Blackwell, T. H., Russell, J. K., & Jorgenson, D. B. (2004). Body weight does not affect defibrillation, resuscitation, or survival in patients with out-of-hospital cardiac arrest treated with a nonescalating biphasic waveform defibrillator. Critical Care Medicine, 32(9), S387–S392.

Whitney, S. N. (2003). A new model of medical decisions: Exploring the limits of shared decision making. Medical Decision Making, 23, 275–280.

Zaroff, J. G., diTommaso, D. G., & Barron, H. V. (2002). A risk model derived from the National Registry of Myocardial Infarction database for predicting mortality after coronary artery bypass grafting during acute myocardial infarction. American Journal of Cardiology, 90(1), 35–38.