Spatial Econometrics: Methods and Applications (Studies in Empirical Economics)


Author: Giuseppe Arbia | Badi H. Baltagi


Studies in Empirical Economics

Aman Ullah (Ed.) Semiparametric and Nonparametric Econometrics 1989. ISBN 978-3-7908-0418-8

Thomas Url and Andreas Wörgötter (Eds.) Econometrics of Short and Unreliable Time Series 1995. ISBN 978-3-7908-0879-7

Walter Krämer (Ed.) Econometrics of Structural Change 1989. ISBN 978-3-7908-0432-4

Steven Durlauf, John F. Helliwell and Baldev Raj (Eds.) Long-Run Economic Growth 1996. ISBN 978-3-7908-0959-6

Wolfgang Franz (Ed.) Hysteresis Effects in Economic Models 1990. ISBN 978-3-7908-0482-9

John Piggott and John Whalley (Eds.) Applied General Equilibrium 1991. ISBN 978-3-7908-0530-7

Baldev Raj and Badi H. Baltagi (Eds.) Panel Data Analysis 1992. ISBN 978-3-7908-0593-2

Daniel J. Slottje and Baldev Raj (Eds.) Income Inequality, Poverty, and Economic Welfare 1998. ISBN 978-3-7908-1136-0

Robin Boadway and Baldev Raj (Eds.) Advances in Public Economics 2000. ISBN 978-3-7908-1283-1

Josef Christl The Unemployment/Vacancy Curve 1992. ISBN 978-3-7908-0625-0

Bernd Fitzenberger, Roger Koenker and José A. E. Machado (Eds.) Economic Applications of Quantile Regression 2002. ISBN 978-3-7908-1448-4

Jürgen Kaehler and Peter Kugler (Eds.) Econometric Analysis of Financial Markets 1994. ISBN 978-3-7908-0740-0

James D. Hamilton and Baldev Raj (Eds.) Advances in Markov-Switching Models 2002. ISBN 978-3-7908-1515-3

Klaus F. Zimmermann (Ed.) Output and Employment Fluctuations 1994. ISBN 978-3-7908-0754-7

Badi H. Baltagi (Ed.) Panel Data 2004. ISBN 978-3-7908-0142-2

Jean-Marie Dufour and Baldev Raj (Eds.) New Developments In Time Series Econometrics 1994. ISBN 978-3-7908-0766-0

Luc Bauwens, Winfried Pohlmeier and David Veredas (Eds.) High Frequency Financial Econometrics 2008. ISBN 978-3-7908-1991-5

John D. Hey (Ed.) Experimental Economics 1994. ISBN 978-3-7908-0810-0

Christian Dustmann, Bernd Fitzenberger and Stephen Machin (Eds.) The Economics of Education and Training 2008. ISBN 978-3-7908-2021-8

Arno Riedl, Georg Winckler and Andreas Wörgötter (Eds.) Macroeconomic Policy Games 1995. ISBN 978-3-7908-0857-5

Giuseppe Arbia · Badi H. Baltagi (Eds.)

Spatial Econometrics Methods and Applications

Physica-Verlag A Springer Company

Editorial Board Heather M. Anderson Australian National University Canberra, Australia

Bernd Fitzenberger University of Freiburg Germany

Badi H. Baltagi Syracuse University Syracuse, New York, USA

Robert M. Kunst Institute for Advanced Studies Vienna, Austria

Editors

Professor Giuseppe Arbia University “G. d’Annunzio” of Chieti-Pescara Department of the Business, Statistical, Technological and Environment Sciences Viale Pindaro, 42 65127 Pescara Italy [email protected]

Professor Badi H. Baltagi Syracuse University Center for Policy Research 426 Eggers Hall Syracuse, NY 13244-1020 USA [email protected]

All papers have been first published in “Empirical Economics”

ISBN 978-3-7908-2069-0

e-ISBN 978-3-7908-2070-6

Library of Congress Control Number: 2008935139

© Physica-Verlag Heidelberg 2009

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Physica-Verlag. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover Design: WMXDesign GmbH, Heidelberg, Germany

Printed on acid-free paper

9 8 7 6 5 4 3 2 1

springer.com

Contents

1. Introduction
Badi H. Baltagi and Giuseppe Arbia

2. Errors in Variables and Spatial Effects in Hedonic House Price Models of Ambient Air Quality
Luc Anselin and Nancy Lozano-Gracia

3. A Generalized Method of Moments Estimator for a Spatial Model with Moving Average Errors, with Application to Real Estate Prices
Bernard Fingleton

4. Spatial Analysis of Urban Growth in Spain, 1900–2001
Julie Le Gallo and Coro Chasco

5. A Class of Spatial Econometric Methods in the Empirical Analysis of Clusters of Firms in the Space
Giuseppe Arbia, Giuseppe Espa, and Danny Quah

6. A Spatially-Filtered Mixture of β-Convergence Regressions for European Regions, 1980–2002
Michele Battisti and Gianfranco Di Vaio

7. Spatial Shift-Share Analysis Versus Spatial Filtering: An Application to Spanish Employment Data
Matías Mayor and Ana Jesús López

8. R&D Spillovers and Firms’ Performance in Italy: Evidence from a Flexible Production Function
Francesco Aiello and Paola Cardamone

9. The Impact of Decentralization and Inter-Territorial Interactions on Spanish Health Expenditure
Joan Costa-Font and Francesco Moscone

10. Regional Evidence on Financial Development, Finance Term Structure and Growth
Andrea Vaona

11. Convergence in Per-Capita GDP Across European Regions: A Reappraisal
Valentina Meliciani and Franco Peracchi

12. Locational Choice and Price Competition: Some Empirical Results for the Austrian Retail Gasoline Market
Gerhard Clemenz and Klaus Gugler

13. Dynamic Spatial Modelling of Regional Convergence Processes
Reinhold Kosfeld and Jorgen Lauridsen

14. Spatial and Supply/Demand Agglomeration Economies: State- and Industry-Linkages in the U.S. Food System
Jeffrey P. Cohen and Catherine J. Morrison Paul

Introduction

Badi H. Baltagi · Giuseppe Arbia

Most of the papers appearing in this book also appeared in a special issue of Empirical Economics on spatial econometrics. These papers were solicited from the International Workshop on Spatial Econometrics and Statistics held at LUISS “Guido Carli” University in Rome, Italy, 25–27 May 2006. This conference also saw the birth of the Spatial Econometric Association. The Association’s aim is to promote the development of theoretical tools and sound applications of the discipline of spatial econometrics, including spatial statistics and spatial data analysis. Spatial econometrics should be viewed in a wide sense involving developments of models and statistical tools for the analysis of externalities, spillovers, interactions etc., in various areas including economics, geography and regional science, etc. (from the By-laws of the Association). In addition, we include four papers on spatial econometrics that appeared in regular issues of Empirical Economics over the period 2003–2006.

This book includes methodology papers. Fingleton generalizes the GMM estimator proposed by Kelejian and Prucha (1999) for the spatial regression model with autoregressive errors (SARAR) to a spatial autoregressive model with moving average errors (SARMA). Arbia, Espa and Quah provide new statistical tools to study the complex interaction between spatial concentration, regional growth and knowledge spillovers.

This book also includes applications of spatial econometrics to (1) the valuation of the effect of improved air quality through the estimation of hedonic models of house prices, see Anselin and Lozano-Gracia, (2) the evolution of population growth in Spain, see Le Gallo and Chasco, (3) the β-convergence model across EU regions, see Battisti and Di Vaio, convergence in per-capita GDP across European regions, see Meliciani and Peracchi, and conditional income and productivity convergence across labour market regions in unified Germany, see Kosfeld and Lauridsen, (4) the evolution of regional employment in Spain, see Mayor and López, (5) the impact of R&D spillovers on production in Italian manufacturing, see Aiello and Cardamone, (6) the economic determinants of health care activity in Spain, see Costa-Font and Moscone, (7) the finance-growth nexus in Italy, see Vaona, (8) the Austrian retail gasoline market, see Clemenz and Gugler, and (9) supply/demand agglomeration economies in the U.S. food system, see Cohen and Morrison Paul.

B. H. Baltagi (B) Center for Policy Research, Syracuse University, 426 Eggers Hall, Syracuse, NY 13244-1020, USA e-mail: [email protected]

G. Arbia Department of the Business, Statistical, Technological and Environmental Sciences, University “G. d’Annunzio” of Chieti–Pescara, Viale Pindaro 42, 65127 Pescara, Italy e-mail: [email protected]

Summary of the Contributions

Anselin and Lozano-Gracia consider the valuation of the effect of improved air quality through the estimation of hedonic models of house prices. Since the potential errors in variables aspect of the interpolated air pollution measures is often ignored, this paper assesses the extent to which this may affect the resulting empirical estimates for marginal willingness to pay (MWTP). It uses an extensive sample of over 100,000 individual house sales for 1999 in the South Coast Air Quality Management District of Southern California. The paper takes account of spatial dependence and endogeneity. It also accounts for both spatial autocorrelation and heteroskedasticity in the error terms, using the Kelejian–Prucha HAC estimator. Their results are consistent across different spatial weights matrices and different kernel functions and suggest that the bias from ignoring the endogeneity in interpolated values may be substantial.

Fingleton generalizes the GMM estimator proposed by Kelejian and Prucha (1999) for the spatial regression model with autoregressive errors (SARAR) to a spatial autoregressive model with moving average errors (SARMA). Monte Carlo experiments are performed which suggest that the GMM estimates behave well and are robust to non-normality. Fingleton also suggests the bootstrap method as a way of testing the significance of the moving average parameter. The SARMA model is applied to English real estate price data, providing evidence that price levels depend on the level of income both locally and within commuting distance, on local school quality, and on the stock of properties within the area. Fingleton finds a significant effect attributable to the spatial lag of prices, which is interpreted as the net outcome of the displacement of demand and of supply between areas. The moving average error process represents spatially autocorrelated unmodelled variables.
Le Gallo and Chasco study the evolution of population growth in Spain using a group of 722 municipalities over the period 1900–2001. In particular, they obtain non-parametric kernel density estimates of the urban population distribution for each decade and analyze its monomodality or multimodality characteristics. They estimate a spatial SUR model for Zipf’s law and show the existence of two main phases: divergence (1900–1980) and convergence (1980–2001). Zipf’s law, or the rank-size rule, claims that the size distribution of cities follows a Pareto law. The density functions and Zipf’s law allow the characterization of the evolution of the global distribution, but they do not provide any information about the movements of the urban municipalities within this distribution. This is done by tracking the evolution of each urban municipality’s relative size over time and estimating transition probability matrices associated with discrete Markov chains. Spatial effects are introduced within the Markov chain framework using regional conditioning. This analysis shows low interclass mobility, i.e. a strong tendency of urban municipalities to stay in their own class from one decade to another over the whole period, and the influence of the geographical environment on urban population dynamism.

Studies of knowledge spillovers have received increasing attention in the literature on economic growth. In fact some theories explicitly link the presence of innovations to the growth of cities, seen as the places where the high concentration of individuals, firms and workers creates positive externalities which, in turn, foster economic growth. A large part of the empirical literature has concentrated on measuring the impact of technological spillovers on the innovation performances of regions. In many instances the number of patents and the relative citations have been used as proxies of the flow of knowledge and of the related innovative output. Arbia, Espa and Quah show the importance of distance based measures of spatial concentration in tackling this important emerging research area. They provide new statistical tools to study the complex interaction between spatial concentration, regional growth and knowledge spillovers. The empirical application involves a study of the inter-sectoral location of innovation in Italy based on the European Patent Office (EPO) dataset which records all patent applications in Europe. They are able to identify some distinctive joint patterns of location between patents of different sectors and to propose some possible economic interpretations.
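As an illustration of the Markov chain step in the Le Gallo and Chasco summary above, a transition probability matrix can be estimated by counting moves between size classes from one period to the next and normalizing each row. The Python sketch below is not taken from their paper: the class cutoffs, the toy data and the function names are assumptions made for illustration only, and regional conditioning is left out.

```python
import numpy as np

def size_classes(rel_size, cutoffs):
    """Map relative sizes to class indices 0..len(cutoffs) using ascending cutoffs."""
    return np.digitize(rel_size, cutoffs)

def transition_matrix(classes_by_period, n_classes):
    """Estimate a first-order Markov transition matrix from observed class paths.

    classes_by_period: integer array of shape (n_units, n_periods).
    Entry (i, j) is the share of units in class i at period t that are
    found in class j at period t + 1, pooled over all periods.
    """
    counts = np.zeros((n_classes, n_classes))
    start = classes_by_period[:, :-1].ravel()
    end = classes_by_period[:, 1:].ravel()
    np.add.at(counts, (start, end), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Classes that are never observed keep a row of zeros.
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)

# Toy example: four hypothetical municipalities observed in three periods,
# sizes expressed relative to the mean, with assumed cutoffs 0.75 and 1.5.
rel_size = np.array([[0.4, 0.5, 0.6],
                     [1.2, 1.1, 0.9],
                     [2.5, 2.4, 2.6],
                     [0.7, 1.3, 1.4]])
classes = size_classes(rel_size, cutoffs=[0.75, 1.5])
P = transition_matrix(classes, n_classes=3)
print(P)
```

Large diagonal entries of `P` correspond to the low interclass mobility described in the text; off-diagonal entries measure movement between classes.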
Battisti and Di Vaio apply a spatially filtered mixture regression approach to the β-convergence model across EU regions over the period 1980–2002. Their results indicate that spatial effects matter, and that absolute, conditional, or club convergence are restrictive assumptions when applied to the whole sample. Excluding a small number of regions that behave as outliers, only a few regions show an appreciable rate of convergence. The majority of the data show slow convergence, well below 2%, or no convergence at all.

Mayor and López analyze the influence of spatial effects in the evolution of regional employment in Spain. Two non-parametric techniques are used: spatial shift-share analysis and spatial filtering. Advantages and limitations of each of these procedures are discussed, along with their sensitivity with regard to the weights matrix considered.

Aiello and Cardamone study the impact of R&D spillovers on production for a balanced panel of 1,203 Italian manufacturing firms over the period 1998–2003. Estimation is based on a translog production function augmented by a measure of R&D spillovers that combines the geographical distance and the technological similarity within each pair of firms. In particular, they calculate the R&D spillovers as the weighted sum of indirect stock R&D capital. The weighting scheme uses an index of similarity for each pair of firms. The hypothesis is that the more similar the two firms are, the greater the flow of innovation between them. The paper finds that the contribution of R&D spillovers to firms’ production is positive. This is robust to the weighting scheme of knowledge transmission and the sample of firms used. They also show that geographical proximity matters.

Costa-Font and Moscone examine the influence of a set of institutional, political and economic determinants of health care activity in Spain. This is done using panel data on Spanish health care expenditure at the regional level over the period 1995–2002. Results are consistent with some degree of interdependence between neighboring regions in spending decisions. Spatial interactions among regions seem to play a role in explaining total expenditure and its major categories (pharmaceutical, inpatient, and ambulatory). Empirical evidence of long term efficiency effects of health care decentralization suggests that a specific spatial-institutional design might improve the health system efficiency as well as regional cohesion.

Vaona considers the finance-growth nexus and offers regional evidence on this issue by using cross-sectional and panel data sets of respectively 94 and 73 Italian provinces (NUTS3 regions). Vaona argues that finance leads growth, and that this result is robust to spatial unobserved heterogeneity. Economic growth appears to be favoured by credit to private firms and more by short-term credit than by long-term credit.

Meliciani and Peracchi study the convergence in per-capita GDP across European regions over the period 1980–2000. They use median unbiased estimators of the rate of convergence to the steady-state growth path, while allowing for unrestricted patterns of heterogeneity and spatial correlation across regions. Their results differ from those found using conventional estimators in that the mean rate of convergence is much lower, and for most regions this rate is zero. Also, the number of regions for which they reject equality in trend growth rates is substantially lower. They also find significant evidence of correlation of growth rates across neighboring regions and across regions belonging to the same country.
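The "rate of convergence" discussed in the convergence papers above is usually backed out of a cross-section regression of average log growth on initial log income. The Python sketch below uses the textbook absolute β-convergence specification, (1/T) ln(y_T / y_0) = a + b ln(y_0) + e with b = −(1 − e^{−λT})/T, so that λ = −ln(1 + bT)/T. It is an assumed, simplified form with no spatial filtering, mixture components or median unbiased correction; the function name and the toy data are hypothetical.

```python
import numpy as np

def beta_convergence(y0, yT, T):
    """OLS of average log growth on initial log income.

    Returns (b, lam): the slope on ln(y0) and the implied annual
    convergence rate lam = -ln(1 + b*T) / T. The rate is only defined
    when 1 + b*T > 0.
    """
    y0 = np.asarray(y0, dtype=float)
    yT = np.asarray(yT, dtype=float)
    growth = (np.log(yT) - np.log(y0)) / T
    X = np.column_stack([np.ones_like(y0), np.log(y0)])
    coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
    b = coef[1]
    lam = -np.log(1.0 + b * T) / T
    return b, lam

# Toy data generated without noise from a 2% annual rate over 22 years
# toward a common (assumed) steady-state income of 30.
T, true_rate, steady_state = 22, 0.02, 30.0
y0 = np.array([5.0, 10.0, 20.0, 40.0])
decay = np.exp(-true_rate * T)
yT = np.exp(decay * np.log(y0) + (1.0 - decay) * np.log(steady_state))
b, lam = beta_convergence(y0, yT, T)
print(b, lam)
```

A negative slope `b` means that initially poorer regions grow faster; a slope near zero corresponds to the "no convergence at all" case mentioned for most regions.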
Clemenz and Gugler use data from the Austrian retail gasoline market to find that a higher station density reduces average prices. They also find that market (i.e. ownership) concentration does not significantly affect the average price and is negatively related to the density of stations. They argue that the spatial dimension of markets allows the identification of market conduct, which is particularly relevant for competition policy.

Kosfeld and Lauridsen introduce a dynamic spatial modelling approach which is suitable to trace regional adjustment processes in space instead of time. It is shown how the spatial error-correction mechanism (SEC model) can be estimated depending on the spatial stationarity properties of the variables under investigation. This is applied to the issue of conditional income and productivity convergence across labour market regions in unified Germany.

Cohen and Morrison Paul study cost-impacts of spatial and industrial spillovers on economic performance. These are evaluated by incorporating activity level measures for nearby states and related industries into a cost function model. The focus is on localization and urbanization economies for state level food processing industries, from activity levels of similar industries in neighboring states, agricultural input suppliers, and final product demand. They find significant cost-savings from proximity to other food manufacturing centers, and areas with high purchasing power.

Errors in variables and spatial effects in hedonic house price models of ambient air quality

Luc Anselin · Nancy Lozano-Gracia

Abstract In the valuation of the effect of improved air quality through the estimation of hedonic models of house prices, the potential “errors in variables” aspect of the interpolated air pollution measures is often ignored. In this paper, we assess the extent to which this may affect the resulting empirical estimates for marginal willingness to pay (MWTP), using an extensive sample of over 100,000 individual house sales for 1999 in the South Coast Air Quality Management District of Southern California. We take an explicit spatial econometric perspective and account for spatial dependence and endogeneity using recently developed Spatial 2SLS estimation methods. We also account for both spatial autocorrelation and heteroskedasticity in the error terms, using the Kelejian–Prucha HAC estimator. Our results are consistent across different spatial weights matrices and different kernel functions and suggest that the bias from ignoring the endogeneity in interpolated values may be substantial.

Keywords Spatial econometrics · Hedonic models · HAC estimation · Endogeneity · Air quality valuation · Real estate markets

JEL Classification C21 · Q51 · Q53 · R31

This paper is part of a joint research effort with James Murdoch (University of Texas, Dallas) and Mark Thayer (San Diego State University). Their valuable input is gratefully acknowledged. The research was supported in part by NSF Grant BCS-9978058 to the Center for Spatially Integrated Social Science (CSISS), and by NSF/EPA Grant SES-0084213. Earlier versions were presented at the 5th International Workshop on Spatial Econometrics and Statistics, Rome, Italy, May 2006, the 53rd North American Meetings of the Regional Science Association International, Toronto, ON, Nov. 2006, the 2007 Meetings of the Allied Social Science Associations, Chicago, IL, Jan 2007, and at departmental seminars at the University of Illinois. Comments by discussants and participants are greatly appreciated. A special thanks to Harry Kelejian for his detailed and patient clarification of the HAC estimator. The usual disclaimer holds.

L. Anselin (B) School of Geographical Sciences, Arizona State University, Tempe, AZ 85287-0104, USA e-mail: [email protected]

N. Lozano-Gracia Spatial Analysis Laboratory (SAL) and Department of Agricultural and Consumer Economics, University of Illinois, Urbana-Champaign, Urbana, IL 61801, USA e-mail: [email protected]

Present Address: N. Lozano-Gracia School of Geographical Sciences, Arizona State University, Tempe, AZ 85287-0104, USA

1 Introduction

An important aspect of assessing the effectiveness of environmental policies that address the improvement of air quality is obtaining a quantitative measure of the economic value of the accrued benefits (e.g., Freeman 2003). In the absence of an explicit market for clean air, several methods have been suggested to estimate this value empirically, such as contingent valuation, conjoint analysis, discrete choice models and hedonic specifications. In this paper, we focus on the latter and consider some methodological issues associated with the estimation of an implicit price for clean air by including one or more pollution variables in a hedonic model of house prices. The rationale behind this approach is that, ceteris paribus, houses in areas with cleaner air will have this benefit capitalized into their value, which should be reflected in a higher sales price.

The hedonic approach has become an established methodology in environmental economics (e.g., Palmquist 1991). Originating with the classic studies of Ridker and Henning (1967) and Harrison and Rubinfeld (1978), it has generated a voluminous literature dealing with theoretical, methodological and empirical aspects. Extensive reviews are provided in Smith and Huang (1993, 1995), Boyle and Kiel (2001), and Chay and Greenstone (2005), among others.

Recently, empirical econometric work has started to take into account the potential bias and loss of efficiency that can result when spatial effects such as spatial autocorrelation and spatial heterogeneity are ignored in the estimation process. Spatial econometric methods (Anselin 1988), which incorporate the spatial dependence in cross-sectional data into model specification, estimation and testing have become fairly commonplace in empirical studies of housing and real estate, leading to so-called spatial hedonic models. Reviews of the basic specifications and estimation methods are provided in Anselin (1998), Basu and Thibodeau (1998), Pace et al. (1998), Dubin et al. (1999), Gillen et al. (2001), and Pace and LeSage (2004), among others. In the context of the valuation of environmental amenities, a spatial hedonic approach has been less common, although some recent applications include Kim et al. (2003), Beron et al. (2004), Brasington and Hite (2005), and Anselin and Le Gallo (2006). A theoretical perspective is offered in Small and Steimetz (2006).


In Chay and Greenstone (2005) (CG), several methodological issues are addressed pertaining to the identification and consistent estimation of the implicit price of air quality, using total suspended particulates as an environmental indicator. Specifically, CG focus on the potential endogeneity of the pollution variable and suggest an instrumental variable approach to estimate it consistently. They also consider potential endogeneity due to sorting by house purchasers when there is heterogeneity in their preference functions with different pollution levels. While considerable care is taken in addressing these specification problems, the model itself is estimated at a fairly aggregate spatial scale of US counties. Bayer et al. (2006) follow Chay and Greenstone (2005) by suggesting the possibility of local air pollution being correlated with unobserved local characteristics. They address this form of endogeneity by using the contribution of distant sources to local air pollution as an instrument for air pollution at the county level. In this paper, we focus on a separate source of endogeneity of the air quality variables in the hedonic specification. We elaborate on an idea outlined in Anselin (2001c), where it was argued that the use of spatially interpolated values for air quality (or, pollution) results in a prediction error which may be correlated with the overall model disturbance term. This would lead to simultaneity bias in an ordinary least squares regression. We thus consider the treatment of endogeneity in the pollution variable from the particular perspective of an “errors in variables” problem. We use polynomials in the coordinates of the house locations as instruments to correct for this endogeneity. 
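To make the instrument construction concrete, the following Python sketch builds polynomial terms from house coordinates. The polynomial degree, the function name and the suggestion to center the coordinates are assumptions for illustration; this passage does not state which order of polynomial is used.

```python
import numpy as np

def coordinate_polynomials(x, y, degree=2):
    """Return all monomials x**p * y**q with 1 <= p + q <= degree as columns.

    For degree=2 the columns are x, y, x^2, x*y, y^2. The constant term is
    left out because it is normally already part of the regressor matrix.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = []
    for total in range(1, degree + 1):
        for q in range(total + 1):
            p = total - q
            cols.append(x**p * y**q)
    return np.column_stack(cols)

# Hypothetical projected coordinates (in km) for three houses.
east = np.array([2.0, 5.0, 9.0])
north = np.array([3.0, 1.0, 4.0])
# Centering reduces collinearity among the polynomial terms.
H = coordinate_polynomials(east - east.mean(), north - north.mean())
print(H.shape)
```

The resulting columns would enter the first stage of an instrumental variables regression alongside the exogenous house and neighborhood characteristics.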
In contrast to the aggregate approach of CG, our empirical work is based on observations for individual house transactions.1 Consequently, we face the mismatch between the spatial support of the explanatory variable, a pollution measure collected at a finite set of monitoring stations, and the dependent variable, the price observed at the location of the house sales transaction. As outlined in Anselin and Le Gallo (2006), this requires a spatial interpolation operation. Several alternatives are possible, each with implications for the precision of the resulting variable. We take an explicit spatial econometric approach and include a spatially lagged dependent variable (spatial lag) in the hedonic specification. The combination of the endogeneity of the spatial lag and the air quality variables requires the application of spatial two stage least squares estimation (Anselin 1988; Kelejian and Robinson 1993; Kelejian and Prucha 1998; Lee 2003, 2006) and specialized test statistics (Anselin and Kelejian 1997). In addition, we allow for remaining spatial autocorrelation and heteroskedasticity of an unspecified nature (HAC) and obtain robust standard error estimates using the method of Kelejian and Prucha (2006a). We believe ours is the first true empirical application of spatial hedonic models in which both types of endogeneity (spatial and non-spatial) are considered jointly and that uses the HAC standard errors.

1 CG also employ a panel data set with observations at two points in time, whereas our sample is a pure cross-section. CG do not consider spatial effects. In our work, we do not explicitly consider endogeneity due to sorting. However, from an empirical point of view, the source of the endogeneity is irrelevant once it is properly accounted for.
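The spatial two stage least squares idea referred to above can be sketched in Python for the simplest case, y = ρWy + Xβ + ε, where the spatial lag Wy is instrumented with spatially lagged exogenous variables (WX and W²X). This is a minimal sketch under stated assumptions, not the estimator as implemented in this paper: the ring-shaped weights matrix, the simulated data and all function names are hypothetical, there is no additional endogenous pollution variable, and no HAC standard errors are computed.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized weights: each unit's neighbors are its two adjacent units on a ring."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def spatial_2sls(y, X, W):
    """2SLS for y = rho*W y + X beta + e, instrumenting W y with W X and W^2 X.

    X must hold a constant in its first column; the constant is dropped from
    the lagged instruments because W @ 1 = 1 under row standardization.
    Returns the coefficient vector (rho, beta_0, beta_1, ...).
    """
    Z = np.column_stack([W @ y, X])               # regressors, W y endogenous
    X1 = X[:, 1:]
    H = np.column_stack([X, W @ X1, W @ W @ X1])  # instrument set
    # First stage: project the regressors on the instruments.
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
    # Second stage: solve (Z_hat' Z) delta = Z_hat' y.
    return np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)

def simulate(n=200, rho=0.4, beta=(1.0, 2.0), sigma=0.5, seed=0):
    """Draw data from the reduced form y = (I - rho W)^{-1} (X beta + e)."""
    rng = np.random.default_rng(seed)
    W = ring_weights(n)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    e = sigma * rng.normal(size=n)
    y = np.linalg.solve(np.eye(n) - rho * W, X @ np.asarray(beta) + e)
    return y, X, W

y, X, W = simulate()
print(spatial_2sls(y, X, W))  # estimates of (rho, intercept, slope)
```

An endogenous explanatory variable such as interpolated air quality would be appended to `Z`, with its own instruments (for example coordinate polynomials) appended to `H`.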


We assess the extent to which the selection of a particular method affects the parameter estimates in the hedonic function and the derived economic valuation of marginal willingness to pay (MWTP) for improved air quality. Specifically, we compare nonspatial to spatial hedonic specifications and estimation with and without instruments for the endogeneity of the air quality variable. We further assess the robustness of our findings by carrying out estimation for different spatial weights and different kernel functions.

We pursue this empirical assessment by means of an investigation of a sample of 115,732 house sales in the South Coast Air Quality Management District of Southern California, for which we have detailed characteristics, as well as neighborhood measures and observations on ozone and particulate matter.2

In the remainder of the paper, we first provide a brief discussion of data sources and variables included in the model. We next give some methodological background on the spatial econometric estimators and test statistics used. This is followed by a review of the estimation results, with a special focus on the estimates of the parameters of the air quality variables. In a brief discussion of policy implications, we compare the estimates for marginal willingness to pay. We close with some concluding remarks.

2 Data and variables

The basic data used in this paper come from three main sources: Experian Company (formerly TRW) for the individual house sales price and characteristics, the 2000 US Census of Population and Housing for the neighborhood characteristics (at the census tract and block group level), and the South Coast Air Quality Management District for the measures of ozone (OZ) and particulate matter (TSP) concentration.
The house price and characteristics are from 115,729 sales transactions of owner-occupied single family homes that occurred during 1999 in the region, which covers four counties: Los Angeles (LA), Riverside (RI), San Bernardino (SB) and Orange (OR). The data were geocoded, which allows for the assignment of each house to any spatially aggregate administrative district (such as a census tract, block group or a school district) and for the computation of accessibility measures and interpolated pollution values for the location of each individual house in the sample. House price and characteristics are matched with neighborhood and locational characteristics at the census tract, and, where possible, at the block group level from the 2000 U.S. Census of Population and Housing.3 The variables used in the hedonic specification are essentially the same as those in earlier work by Beron et al. (2004) and Anselin and Le Gallo (2006). This base set is extended with newly computed measures on crime rates, school quality, distance 2 Other studies of the relation between house prices and air quality in this region can be found in Graves

et al. (1988), Beron et al. (1999, 2001, 2004), and Anselin and Le Gallo (2006), although only the latter two take an explicit spatial econometric approach. Also of interest is a general equilibrium analysis of ozone abatement in the same region, using a hierarchical locational equilibrium model, outlined in Smith et al. (2004). 3 We assume that the values obtained for the 2000 Census are representative of the spatial distribution in

1999.

Errors in variables and spatial effects in hedonic house price models of ambient air quality

9

Table 1 Variable names and description

Variable name   Description
Elevation       Relative elevation of the house
Livarea         Interior living space (10,000 sq.m.)
Landarea        Lot size (1,000 sq.m.)
Baths           Number of bathrooms
Fireplace       Number of fireplaces
Pool            Indicator variable for swimming pool
Age             Age of the house (10 years)
AC              Indicator variable for central air conditioning
Heat            Indicator variable for central heating
Beach           Indicator variable for location less than 5 miles from beach
Avdistp         Average distance to parks in meters
Highway1        Indicator variable for location within 0.25 km of a highway
Highway2        Indicator variable for location within 0.25–1 km of a highway
Traveltime      Average time to work in census tract (CT)
Poverty         % of population with income below the poverty level in CT
White           % of the population that is white in the census block group (BG)
Over65          % of the population older than 65 years in the census BG
College         % of population with college in the CT
Income          Median household income in BG (10,000 US$)
Vcrime          Violent crime rate for the city (or non urban county rate)
API             Average academic performance index for the school district
Riverside       Indicator variable for Riverside county
San Bern.       Indicator variable for San Bernardino county
Orange          Indicator variable for Orange county
OZ              Ozone measured in ppb
TSP             Total Suspended Particles in µg/m³

to parks, and access to the highway system. All the variables used in the analysis are listed in Tables 1 and 2. We grouped the variables in Table 1 into five categories: house-specific characteristics from the Experian data set; location-specific characteristics, such as accessibility measures, computed from the house coordinates; neighborhood characteristics, obtained from the Census, supplemented with variables calculated from the FBI Uniform Crime Reports and the State of California Department of Education school performance scores; county dummies; and interpolated air pollution values. Five new variables are included in the current analysis that were not used in Anselin and Le Gallo (2006): Vcrime, API, Avdistp, Highway1 and Highway2. They were computed from different sources. Crime rates for violent crimes taking place during 1998 were obtained from the FBI Uniform Crime database. This measure is reported at the city as well as the county level. Where possible, we assigned the city level crime


L. Anselin, N. Lozano-Gracia

Table 2 Basic descriptive statistics for all variables

Variable name      Mean       Std. Deviation   Min       Max
House price        243,346    210,000          20,000    5,345,455
Ln(house price)    12.213     0.571            9.900     15.490
Elevation          0.995      0.145            −4.000    6.588
Livarea            0.160      0.073            0.050     3.182
Landarea           8.900      19.072           0.8       2818.332
Baths              1.924      0.799            0.500     9.500
Fireplace          0.643      0.560            0         7
Pool               0.150      0.357            0         1
Age                4.287      2.023            0.1       10
AC                 0.407      0.491            0         1
Heat               0.277      0.447            0         1
Beach              0.012      0.111            0         1
Pavdist            5.637      0.991            4.447     8.992
Highway1           0.091      0.288            0         1
Highway2           0.342      0.475            0         1
Traveltime         2.936      0.412            1.014     4.717
Poverty            0.120      0.091            0         0.670
White              0.570      0.221            0         1
Over65             0.105      0.059            0         0.868
College            0.259      0.176            0         0.800
Income             5.946      2.588            0         20.000
Vcrime             0.142      0.057            0.037     0.348
API                5.948      0.920            4.271     8.918
Riverside          0.056      0.230            0         1
San Bern.          0.118      0.323            0         1
Orange             0.172      0.378            0         1
OZ                 8.111      1.838            4.717     13.467
TSP                82.101     14.435           54.729    121.240

rate to each house in the city. Where crime rates were not available at the city scale, we used the non-urban crime rate for the county in which the house is located. A measure of the average school quality is computed from the Academic Performance Index (API), published by the California Department of Education.4 This is the primary indicator used by the state to evaluate school performance. The API is an index calculated using both base and growth values of student rankings in the State Standardized tests. It is based on a scale from 200 to 1,000 with the target being 800.

4 http://www.cde.ca.gov/ta/ac/ap/.


The average 1999 API value for all schools in a school district is calculated and then assigned to all the houses in the district.5 We supplement the beach access variable with three other indicators of accessibility to amenities. First, we obtained the locations for each park in the four counties from the Geographic Names Information System website.6 For each house location, we then computed the average distance to parks as a summary measure. We also supplemented the Census travel time measure with two other indicators of access to the highway system. These are intended to capture both the negative externalities (such as noise) experienced from being very close to the highways, as well as positive externalities due to shorter travel distances. We used ArcGIS and detailed highway maps7 to define buffers of 0.25 km around the highways and to create two indicator variables. The first takes the value of one if the house is within 0.25 km of a highway, the second takes the value of one if the house is between 0.25 and 1 km from a highway. Air quality is measured as ambient air pollution. In the literature, hedonic specifications typically include either ozone (OZ) or total suspended particulate matter (TSP) as pollutants, since these are most visible in the form of “smog.” In addition, local news outlets report daily measures of these pollutants and broadcast alerts when dangerous levels are reached. Consequently, it is reasonable to assume that these pollutants enter into the utility function of potential buyers, although the question remains to what extent a continuous measure of air quality is the appropriate metric.8 We include both pollutants in the specification, in order to minimize omitted variable problems.9 We use the average of the daily maxima during the worst quarter of 1998 from the hourly observations recorded at monitoring stations for ozone and suspended particles. 
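The construction of the two highway indicators described above can be sketched in a few lines. This is only an illustration: the function name, the example distances, and the treatment of houses lying exactly on a buffer boundary are assumptions, since the text does not specify them.

```python
def highway_indicators(dist_km):
    """Return (Highway1, Highway2) for a distance to the nearest highway in km.

    Boundary handling (<= versus <) is an assumption made for this sketch.
    """
    # Highway1: house lies inside the 0.25 km buffer around a highway.
    highway1 = 1 if dist_km <= 0.25 else 0
    # Highway2: house lies in the ring between 0.25 km and 1 km.
    highway2 = 1 if 0.25 < dist_km <= 1.0 else 0
    return highway1, highway2


# Hypothetical distances (km) from five houses to the nearest highway.
distances = [0.10, 0.25, 0.60, 1.00, 2.30]
indicators = [highway_indicators(d) for d in distances]
```

Houses farther than 1 km from a highway receive zero on both indicators and form the reference group.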
It should be noted that the number and locations of stations in the South Coast Air Quality Management District (SCAQMD) are not the same for each pollutant. In 1998, measurements were available from 28 monitoring stations for OZ, but from only 12 for TSP. The location of the monitoring stations relative to the houses in the sample is illustrated in Fig. 1. This yields a reasonable coverage of the spatial distribution of house locations for OZ, but much less so for TSP, which has fewer than half the number of stations. We interpolate the values at the monitoring stations to the location of every house in the sample using ordinary kriging. Anselin and Le Gallo (2006) find ordinary kriging to be the most reliable among several interpolation methods, including Thiessen

5 It would have been preferable to use a measure of school quality from the year previous to the year in which the house sale takes place, as we do for the air quality measures. However, information for the API in California school districts is only available starting in 1999.
6 http://geonames.usgs.gov/pls/gnispublic/.
7 ESRI Data & Maps CD-ROM (2002). Redlands, CA, USA: Environmental Systems Research Institute.
8 In Anselin and Le Gallo (2006) discrete categories were also considered. In the current paper, our focus is on endogeneity and we leave the issue of the proper metric for a separate analysis.
9 We also ran the analysis for specifications with only one pollutant in the equations and the results and conclusions were qualitatively similar to what we found here. Detailed results are not reported, but available from the authors.


Fig. 1 Spatial distribution of houses and location of monitoring stations

polygons, inverse distance weighting and splines. Figures 2 and 3 show the resulting interpolated values of ozone and particles, with darker color representing higher levels of the pollutant.10 The spatial pattern is very different for the two measures of air pollution. For ozone, lower levels are observed closer to the ocean and air quality seems to worsen as one moves North-East with a suggestion of separate air quality “bands.” For TSP, generally lower pollution is observed in the North-West corner of the Basin, with increasing levels as one moves towards the South-East. The precision of the interpolated value varies across the sample, becoming worse for locations further removed from monitoring sites. To correct for a possible biasing effect of such “high-error” interpolated values, the house locations within the upper 5% of the prediction error distribution for either pollutant were dropped from the sample. This resulted in a final set of 103,867 house locations, of which 67,864 are in LA county, 17,914 in OR county, 12,266 in SB and 5,823 in Riverside county. The observed sales price ranges from $20,000 to $5,345,455, with an overall mean of $243,346. There is considerable variability across counties. For example, the average house price for observations in LA county is $261,946, while it is $269,081 in OR, $148,948 in SB and $146,249 in RI. Figure 4 illustrates the spatial distribution of house prices, with higher prices represented through darker colors. Some concentration of high prices per square meter can be seen along the coast of LA and OR, although overall,

10 Kriging interpolations were carried out using the ESRI ArcGIS Geostatistical Analyst extension. A spherical model allowing for directional effects was used for both pollutants. For OZ the model chosen included 8 lags with a lag size of 9 km, and the estimated parameters were 303.4 and 9 for the direction (angle), 4.16 for the partial sill, 68,604 and 68,236 for the major ranges and 59,381 and 68,236 for the minor ranges. The model chosen for TSP included 9 lags with a lag size of 6 km, and the estimated parameters were 352.8 and 9 for the direction, 546.84 for the partial sill, 50,969 and 50,959 for the major ranges and 11,303 and 50,959 for the minor ranges.


Fig. 2 Kriging interpolation: OZ

Fig. 3 Kriging interpolation: TSP

there is considerable complexity in the spatial distribution of prices. Basic descriptive statistics for all the variables included in the analysis are given in Table 2.

3 Spatial econometric issues

We estimate a hedonic function in log-linear form and take an explicit spatial econometric approach. This includes testing for the presence of spatial autocorrelation and


Fig. 4 Spatial distribution of house prices (Price/sq.m.)

estimating specifications that incorporate spatial dependence.11 We follow Anselin (1988) and distinguish between spatial dependence in the form of a spatially lagged dependent variable, and a model with a spatially correlated error term. We refer to these as spatial lag and spatial error models, respectively. Formally, a spatial lag model is expressed as:

$$y = \rho W y + X\beta + u, \qquad (1)$$

where $y$ is an $n \times 1$ vector of observations on the dependent variable, $X$ is an $n \times k$ matrix of observations on explanatory variables, $W$ is an $n \times n$ spatial weights matrix, $u$ an $n \times 1$ vector of i.i.d. error terms, $\rho$ the spatial autoregressive coefficient, and $\beta$ a $k \times 1$ vector of regression coefficients. The theoretical motivation for a spatial lag specification is based on the literature on interacting agents and social interaction. For example, a spatial lag follows as the equilibrium solution of a spatial reaction function (Brueckner 2003) that includes the decision variable of other agents in the determination of the decision variable of an agent (see also Manski 2000). In the current setting, which is purely cross-sectional, it is difficult to maintain such a theoretical motivation, since it would imply that buyers and sellers simultaneously take into account prices obtained in other transactions. An alternative interpretation is provided by focusing on the reduced form of the spatial lag model:

$$y = (I - \rho W)^{-1} X\beta + (I - \rho W)^{-1} u, \qquad (2)$$

11 For a general overview of methodological issues involved in the specification, estimation and diagnostic testing of spatial econometric models, we refer to Anselin (1988, 2001b, 2006) and Anselin and Bera (1998), among others.


where, under standard regularity conditions, the inverse $(I - \rho W)^{-1}$ can be expressed as a power expansion

$$(I - \rho W)^{-1} = I + \rho W + \rho^2 W^2 + \cdots . \qquad (3)$$
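A minimal numerical sketch of the expansion in Eq. (3) follows. It assumes a small hypothetical neighbor structure (five locations on a ring) and an illustrative value of 0.33 for the autoregressive coefficient; neither is taken from the data, and the helper names are invented for this example. The series is applied to a vector, so the $n \times n$ inverse is never formed explicitly.

```python
import numpy as np


def row_standardize(A):
    """Scale each row of a non-negative matrix so that it sums to one."""
    return A / A.sum(axis=1, keepdims=True)


def neumann_inverse_apply(W, rho, v, tol=1e-10, max_terms=1000):
    """Approximate (I - rho W)^{-1} v by v + rho W v + rho^2 W^2 v + ..."""
    total = v.copy()
    term = v.copy()
    for _ in range(max_terms):
        term = rho * (W @ term)          # next term of the series: rho^k W^k v
        total += term
        if np.max(np.abs(term)) < tol:   # stop once the added term is negligible
            break
    return total


# Hypothetical example: each location has its two ring neighbors (i-1, i+1).
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = 1.0
    A[i, (i + 1) % n] = 1.0
W = row_standardize(A)

rho = 0.33                               # illustrative value only
v = np.arange(1.0, n + 1.0)
approx = neumann_inverse_apply(W, rho, v)
exact = np.linalg.solve(np.eye(n) - rho * W, v)
max_abs_diff = float(np.max(np.abs(approx - exact)))
```

For a row-standardized $W$ and $|\rho| < 1$ the terms shrink geometrically, which is why a truncated series can reach a preset precision with a modest number of sparse matrix-vector products.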

The reduced form thus expresses the house price as a function of the own characteristics ($X$), but also of the characteristics of neighboring properties ($WX$, $W^2X$), albeit subject to a distance decay operator (the combined effect of powering the spatial autoregressive parameter and the spatial weights matrix). In addition, omitted variables, both property-specific as well as related to neighboring properties are encompassed in the error term. In essence, this reflects a scale mismatch between the property location and the spatial scale of the attributes that enter into the determination of the equilibrium price. From a purely empirical perspective, one can also argue that the spatial lag specification allows for a filtering of a strong spatial trend (similar to detrending in the time domain), i.e., to ensure the proper inference for the $\beta$ coefficients when there is insufficient variability across space. Formally, the spatial filter interpretation stresses the estimation of $\beta$ in:

$$y - \rho W y = X\beta + u. \qquad (4)$$

In contrast, spatial error autocorrelation results when omitted variables follow a spatial structure such that the error variance-covariance matrix is no longer diagonal:

$$\mathrm{Var}[u] = E[uu'] = \Sigma, \qquad (5)$$

where $\Sigma \neq \sigma^2 I$, with $I$ as the identity matrix. Arguably, such spatially structured omitted variables may be addressed by means of spatial fixed effects, e.g., by including a dummy variable for each census tract or block group. This rests on the assumption that the spatial range of the unobserved heterogeneity/dependence is specific to each spatially delineated unit. In practice, there may be spatial units (such as school districts) where such a spatial fixed effects approach is sufficient to correct for the problem. However, the nature of omitted neighborhood variables tends to be complex, as is the definition of the correct “neighborhood.” Instead of including spatial fixed effects, we assume a process for the error terms that allows the externalities to spill over throughout the system. More specifically, in contrast to most earlier work, we do not impose a specific functional form, but take a non-parametric perspective, implementing the recent results of Kelejian and Prucha (2006a). By means of the spatial weights matrix $W$, a neighbor set is specified for each location. The elements $w_{ij}$ of $W$ are positive when observations $i$ and $j$ are neighbors, and zero otherwise. By convention, self-neighbors are excluded, such that the diagonal elements of $W$ are zero. In addition, in practice, the weights matrix is typically row-standardized, such that $\sum_j w_{ij} = 1$. Many different definitions of the neighbor relation are possible, and there is little formal guidance in the choice of the “correct” spatial weights.12 The term $Wy$ in Eq. (1) is referred to as a spatially lagged

12 For a more extensive discussion, see Anselin (2002, pp. 256–260), and Anselin (2006, pp. 909–910).


dependent variable, or spatial lag. For a row-standardized weights matrix, it consists of a weighted average of the values of $y$ in neighboring locations, with weights $w_{ij}$. In our application, we consider three spatial weights to assess the sensitivity of the results to this important aspect of the model specification. One weight is derived from the contiguity relationship for Thiessen polygons constructed from the house locations. This effectively turns the spatial representation of the sample from points into polygons. The resulting weights matrix is symmetric and extremely sparse (0.006% non-zero weights). On average it contains 6 neighbors for each location (ranging from a minimum of 3 neighbors to a maximum of 35 neighbors, with 6 as the median). We supplement this with two weights based on a nearest neighbor relation among the locations, for respectively 6 and 12 neighbors. The corresponding weights matrices are asymmetric, but equally sparse (respectively 0.006 and 0.012% non-zero weights). The three weights matrices are used in row-standardized form. We first obtain ordinary least squares (OLS) estimates for the hedonic model and assess the presence of spatial autocorrelation using the Lagrange Multiplier test statistics for error and lag dependence (Anselin 1988), as well as their robust forms (Anselin et al. 1996).13 The results consistently show very strong evidence of positive residual spatial autocorrelation, with an edge in favor of the spatial lag alternative (see Sect. 4). This matches earlier results obtained in Anselin and Le Gallo (2006). We therefore focus on the estimation of the spatial lag model but allow remaining spatial error autocorrelation of unspecified form, as well as heteroskedasticity of unspecified form. Our paper takes two distinctive approaches towards estimation and inference of the spatial hedonic model that warrant further elaboration. First, we use a spatial two-stage least squares estimator (S2SLS) that allows for a spatial lag as well as other endogenous variables. Consider the spatial lag model (1) with an additional term:

$$y = \rho W y + Y\nu + X\beta + u, \qquad (6)$$

where $Y$ is an $n \times p$ matrix of endogenous variables, with associated coefficient vector $\nu$. In our model, the endogenous variables are the air quality variables, say $y_2$ and $y_3$. Since the actual pollution is not observed at the locations $i$ of the house transactions, it is replaced by a spatially interpolated value, such as the result of a kriging prediction. This interpolated value measures the true pollution with error, for example, at location $i$:

$$y_{2i} = y_{2i}^{*} + \psi_i, \qquad (7)$$

where $y_{2i}^{*}$ is the true air quality that enters into the agent’s utility function, $y_{2i}$ is the “observed” value (the interpolated value), and $\psi_i$ an error term. Note that this error is related to the interpolation error to the extent that the predicted item is also what enters into the utility function. An additional source of error would be a discrepancy between what is predicted as air quality and what is included into the agent’s utility function as

13 See Anselin (2001a), for an extensive review of statistical issues.


air quality.14 From a practical perspective, due to the nature of the kriging predictor, the prediction error will be highly spatially structured. We suggest that it therefore is likely to mimic the spatially structured equation disturbance $u$. In addition, the failure to predict air quality correctly at a location may be due to similar omitted variables as those that affect the error of the hedonic specification (e.g., the omitted presence of noxious facilities). As a result, it is likely that $E[\psi_i u_i] \neq 0$, causing simultaneous equation bias due to errors in variables. Using traditional notation, Eq. (6) can be rewritten concisely as:

$$y = Z\gamma + u, \qquad (8)$$

with $Z = [Wy, Y, X]$ and $\gamma = [\rho, \nu', \beta']'$. The spatial two stage least squares estimator is an extension of the standard two stage least squares estimator that includes specific instruments for the spatially lagged dependent variable (see Anselin 1980, 1988; Kelejian and Robinson 1993; Kelejian and Prucha 1998; Kelejian et al. 2004; Lee 2003, 2006). Specifically, consider the $n \times q$ matrix of instruments $Q$, with $q \geq k + p + 1$:

$$Q = [X, WX, H], \qquad (9)$$

where $WX$ is a matrix consisting of the spatially lagged explanatory variables (exogenous variables only, and excluding the intercept), and $H$ is a matrix of instruments for the other endogenous variables (the air quality variables). The use of $WX$ as instruments for the spatial lag is based on the reduced form of the model. The selection of instruments for the errors in variables problem is less straightforward. Proper instruments should be correlated with the unobserved true pollution value $y^{*}$ and uncorrelated with the regression error $u$. The effects on the estimates of using weak instruments have been widely discussed in the literature (see e.g., Staiger and Stock 1997) and the question of how to specify the right instruments remains unresolved for many economic problems. We chose instruments that are able to proxy the overall spatial pattern of the pollution as a global spatial trend. They therefore are unlikely to be correlated with the hedonic error terms, which reflect local spatial patterns of omitted variables. Specifically, we use the latitude, longitude and their product as the instruments. Note that these instruments may also aid in correcting endogeneity due to other factors, such as sorting. As long as they are uncorrelated with the error term, they will yield consistent estimates. However, if the instruments do not accurately capture the causal mechanism underlying the other sources of endogeneity, the resulting estimates will not be most efficient. This needs to be considered together with other sources of inefficiency, such as unobserved heterogeneity and spatial autocorrelation in the error term. In order for the asymptotic properties of the HAC estimator to hold, we only need consistency of the estimates in the first stage, which

14 An early application of instrumental variables in this context within the economic literature is Friedman (1957), where a measurement problem appears when using annual income as a proxy for permanent income in estimating a consumption function.


will be satisfied by our instruments (as long as they are uncorrelated with the error term). With the instrument matrix in hand, we obtain the S2SLS estimates as:

$$\hat{\gamma}_{S2SLS} = [Z'Q(Q'Q)^{-1}Q'Z]^{-1} Z'Q(Q'Q)^{-1}Q'y. \qquad (10)$$
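The estimator in Eq. (10) can be sketched with a few lines of linear algebra. The code below is an illustration under stated assumptions, not the authors' implementation: the function name `s2sls`, the ring-shaped weights, the simulated data and the parameter values are all hypothetical, and the demonstration uses only the spatial lag as endogenous variable (the optional `Y_end` and `H` arguments show where interpolated pollutants and their instruments would enter).

```python
import numpy as np


def s2sls(y, X, W, Y_end=None, H=None):
    """Spatial 2SLS for y = rho*W y + Y_end*nu + X*beta + u.

    X must contain a constant in its first column. The instrument matrix is
    Q = [X, W X (constant excluded), H]; coefficients come back in the order
    of Z = [W y, Y_end, X].
    """
    n = y.shape[0]
    Wy = (W @ y).reshape(n, 1)
    WX = W @ X[:, 1:]                       # spatially lagged exogenous variables
    Z_blocks, Q_blocks = [Wy], [X, WX]
    if Y_end is not None:
        Z_blocks.append(Y_end)
        Q_blocks.append(H)
    Z_blocks.append(X)
    Z = np.hstack(Z_blocks)
    Q = np.hstack(Q_blocks)
    # Projection of Z on the instrument space: Q (Q'Q)^{-1} Q'Z.
    Z_hat = Q @ np.linalg.solve(Q.T @ Q, Q.T @ Z)
    # Z_hat'Z equals Z'Q (Q'Q)^{-1} Q'Z, the bracketed matrix in the estimator.
    gamma = np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)
    resid = y - Z @ gamma
    return gamma, resid


# Simulated illustration (all values hypothetical).
rng = np.random.default_rng(0)
n = 400
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = 1.0
    A[i, (i + 1) % n] = 1.0
W = A / A.sum(axis=1, keepdims=True)        # row-standardized weights
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -1.0])
rho = 0.4
u = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + u)

gamma_hat, resid = s2sls(y, X, W)           # gamma_hat[0] estimates rho
```

The constant is dropped from $WX$ because, with row-standardized weights, its spatial lag is again a constant and would make $Q$ rank deficient.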

Inference is based on the asymptotic variance matrix:

$$\mathrm{AsyVar}[\hat{\gamma}_{S2SLS}] = \hat{\sigma}^2 [Z'Q(Q'Q)^{-1}Q'Z]^{-1}, \qquad (11)$$

with $\hat{\sigma}^2 = (y - Z\hat{\gamma}_{S2SLS})'(y - Z\hat{\gamma}_{S2SLS})/n$. We relax the assumption of homoskedasticity used in (11) and allow for heteroskedasticity of unspecified form. A direct application of the approach outlined in White (1980) yields an alternative estimate for the asymptotic variance matrix as:

$$\mathrm{AsyVar}[\hat{\gamma}_{S2SLS-W}] = [Z'Q\,\widehat{(Q'\Sigma Q)}^{-1} Q'Z]^{-1}, \qquad (12)$$

with $\widehat{(Q'\Sigma Q)}^{-1} = (Q'SQ)^{-1}$, where $S$ is a diagonal matrix containing the squared S2SLS residuals.15 We also continue to test for remaining spatial error autocorrelation, using the generalized LM tests for 2SLS residuals (Anselin and Kelejian 1997). The second distinctive methodological aspect of our approach is that we allow for remaining spatial error autocorrelation of unspecified form. Since the specification tests indicate the presence of such autocorrelation (see Sect. 4), we apply the recently developed heteroskedastic and autocorrelation robust (HAC) approach of Kelejian and Prucha (2006a). This builds upon the framework outlined in Conley (1999) as an extension to the spatial domain of the well-known Newey and West (1987) result from time series analysis (see also Andrews 1991). The core of the HAC technique is a non-parametric estimator for the spatial covariance, using weighted averages of cross-products of residuals, the range of which is determined by a kernel function.16 Formally, we need to obtain an estimate of the matrix $\Psi = Q'\Sigma Q$, where $\Sigma$ is a non-diagonal spatial variance–covariance matrix for the error terms. As Kelejian and Prucha (2006a) show, the estimator for the individual $r, s$ elements of the matrix $\Psi$ is given by:

$$\hat{\psi}_{r,s} = (1/n) \sum_i \sum_j q_{ir} q_{js} \hat{u}_i \hat{u}_j K(d_{ij}/d), \qquad (13)$$

15 For a recent discussion of technical aspects associated with heteroskedastic robust estimation in spatial models, see Kelejian and Prucha (2006b) and Lin and Lee (2005).
16 The origins of this approach can be found in Hall and Patil (1994).


where the subscripts refer to the individual elements of the matrix $Q$ and residual vector $\hat{u}$, and $K$ is a kernel function.17 In the case of OLS, $Q$ is replaced by $X$, the matrix of observations on the explanatory variables. The kernel function $K(\cdot)$ determines which pairs $i, j$ are included in the cross products in (13). The kernel function is a real, continuous and symmetric function that is bounded and integrates to one, similar to a probability density function.18 In the current context, the kernel is formulated as $K(d_{ij}/d)$, where $d_{ij}$ is the distance between $i$ and $j$, and $d$ is the bandwidth, such that $K(d_{ij}/d) = 0$ for $d_{ij} \geq d$. In our application, we use three different kernel functions: the triangular or Bartlett kernel, with $K(z) = 1 - z$ (with $z = d_{ij}/d$), the Epanechnikov kernel, with $K(z) = 1 - z^2$, and the bisquare kernel, with $K(z) = (1 - z^2)^2$. Note that for each of these $K = 1$ for $d_{ij} = 0$. We implement this using a variable bandwidth, based on the distances to the 40 nearest neighbors. Using the estimates for $\Psi$ from (13), the HAC variance for the S2SLS estimates is obtained as:

$$\mathrm{AsyVar}[\hat{\gamma}_{S2SLS-HAC}] = (Z_q'Z_q)^{-1} Z'Q(Q'Q)^{-1}\,\hat{\Psi}\,(Q'Q)^{-1}Q'Z\,(Z_q'Z_q)^{-1}, \qquad (14)$$

with $Z_q'Z_q = Z'Q(Q'Q)^{-1}Q'Z$.

One final methodological note pertains to the assessment of model fit. In spatial models, the use of the standard $R^2$ measure is not appropriate (see Anselin 1988, Chap. 14). In order to provide for an informal comparison of the fit of the various specifications, we report a pseudo-$R^2$ measure, computed as the ratio of the variance of the predicted value to the variance of the observed values. In the classical linear regression model, this is equivalent to the $R^2$, but in the spatial models this measure should be interpreted with caution. In the spatial lag model, the spatially lagged dependent variable $Wy$ is endogenous.
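The three kernels and the kernel-weighted cross-product estimator of Eq. (13) can be sketched as follows. This is a simplified illustration: it uses a single fixed bandwidth rather than the variable bandwidth based on the 40 nearest neighbors, it builds a dense $n \times n$ distance matrix (feasible only for small samples), and all names and the random inputs are hypothetical.

```python
import numpy as np


def bartlett(z):
    """Triangular kernel: 1 - z inside the bandwidth, 0 outside."""
    return np.where(z < 1.0, 1.0 - z, 0.0)


def epanechnikov(z):
    """Epanechnikov kernel: 1 - z^2 inside the bandwidth, 0 outside."""
    return np.where(z < 1.0, 1.0 - z**2, 0.0)


def bisquare(z):
    """Bisquare kernel: (1 - z^2)^2 inside the bandwidth, 0 outside."""
    return np.where(z < 1.0, (1.0 - z**2) ** 2, 0.0)


def hac_psi(Q, resid, coords, bandwidth, kernel=bartlett):
    """(1/n) * sum_i sum_j q_i q_j' u_i u_j K(d_ij / d), fixed bandwidth d."""
    n = Q.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=2))    # pairwise Euclidean distances
    K = kernel(dist / bandwidth)             # kernel weight for each pair (i, j)
    Qu = Q * resid[:, None]                  # row i holds q_i * u_i
    return (Qu.T @ K @ Qu) / n


# Hypothetical inputs: 50 random locations, 3 instruments, random residuals.
rng = np.random.default_rng(1)
n = 50
coords = rng.uniform(size=(n, 2))
Q = rng.normal(size=(n, 3))
resid = rng.normal(size=n)

psi = hac_psi(Q, resid, coords, bandwidth=0.2)
# With a tiny bandwidth only the i = j terms survive (heteroskedasticity only).
psi_white = hac_psi(Q, resid, coords, bandwidth=1e-9)
```

Shrinking the bandwidth towards zero removes all cross-products between different locations, so the estimator collapses to the heteroskedasticity-robust form based on squared residuals.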
We therefore obtain the predicted value from the expression for the conditional expectation of the reduced form:

$$\hat{y} = E[y|X] = (I - \hat{\rho} W)^{-1} X\hat{\beta}. \qquad (15)$$

This operation requires the inverse of a matrix of dimension $n \times n$, which we approximate by means of a power method, accurate up to 6 decimals of precision.

4 Estimation results

We begin the review of our empirical results by focusing on the coefficients obtained using the four estimation methods under consideration: OLS, IV (standard nonspatial 2SLS with the pollutants treated as endogenous), LAG (spatial 2SLS with

17 In practice, the term (1/n) cancels out in the final expression for the variance matrix in (14). We include it here to be consistent with the notation in Kelejian and Prucha (2006a).
18 See, among others, Härdle (1990, Chap. 3), Andrews (1991, pp. 822–823), Simonoff (1996, Chap. 3), and Cameron and Trivedi (2005, pp. 299–300).


Table 3 Coefficient estimates: traditional hedonic variables, queen weights

Variable name     OLS         IV          LAG         LAG-end
Constant          12.12       12.5281     8.1169      8.4700
Landarea          0.0011      0.0012      0.0009      0.0009
Livarea           2.6057      2.5864      2.2326      2.2237
Elevation         −0.0004∗    −0.0029∗    0.0027∗     0.0007∗
Baths             0.0471      0.04724     0.0415      0.0416
Fireplace         0.0457      0.0441      0.0363      0.0352
Pool              0.0505      0.0508      0.0438      0.0440
Age               −0.0166     −0.0197     −0.0130     −0.0153
Age2              0.0190      0.0153      0.0140      0.0113
AC                −0.0249     −0.0229     −0.0159     −0.0150
Heat              0.0386      0.0363      0.0245      0.0229
Beach             0.2405      0.2661      0.1719      0.1934
Distance Parks    −0.0287     −0.0395     −0.0213     −0.0298
Highway1          −0.0199     −0.0234     −0.0130     −0.0155
Highway2          0.0028∗     0.0027∗     0.0043∗∗    0.0043∗∗
Travel time       −0.0649     −0.0579     −0.0541     −0.0494
Poverty           0.0142∗     −0.0210∗    0.0201∗     −0.0454
White             0.3230      0.3179      0.2282      0.2253
Over65            0.1125      0.0376∗∗    0.0327∗∗    −0.0213∗
College           1.0155      0.9192      0.5988      0.5342
Income            0.0212      0.0224      0.0085      0.0096
Vcrime            −0.3938     −0.2450     −0.2446     −0.1261
API               0.0007∗     0.0011∗     0.0007∗     0.0017∗
Riverside         −0.1405     −0.0025∗    −0.0977     0.0006∗
San Bern.         −0.1411     −0.0652     −0.0938     −0.0413
Orange            −0.0077∗∗   0.0579∗∗    −0.0126     0.0370
R² (var ratio)    0.7761      0.7947      0.7814      0.8017

∗ Not significant
∗∗ Significant at 5%

a spatially lagged dependent variable), and LAG-end (spatial 2SLS with a spatially lagged dependent variable and the pollutants treated as endogenous). We separate the results into those for the traditional hedonic variables, reported in Table 3, and those for the pollutant coefficients, reported in Table 4 together with some model diagnostics. The tables only contain results for the queen spatial weights (to create the spatially lagged dependent variable). The complete set of estimates for all three spatial weights is given in the Appendix. First, consider the OLS results. Overall, the coefficients of the house characteristics are significant and of the expected sign, in accordance with earlier findings in the literature. The only exception is relative elevation, which was not found to be significant. House prices increase as both land and living area increase. Similarly, houses with


Table 4 Pollutant coefficients by estimator, queen weights

Variable Name   OLS         IV          LAG        LAG-end
OZ              −0.0253     −0.0137     −0.0195    −0.0099
TSP             −0.0047     −0.0102     −0.0032    −0.0073
ρ               −           −           0.3314     0.3266
RLM-LAG         2357.271    −           −          −
RLM-ERR         1339.671    −           −          −
DWH             2,540       −           −          −
A-K             −           18323.48    60.24      137.46

more bathrooms, fireplaces, as well as with AC and heating systems are higher valued. As the literature suggests (see among others Bourassa et al. 1999; Beron et al. 2004) there appears to be a quadratic relationship between age and price: prices are higher for more recently built houses. There is also a vintage effect of age on prices that is reflected in the positive sign of the quadratic term. In terms of access variables, there is a significant premium for houses that are located closer to the beach and closer to parks, but the effect of the immediate vicinity of the highway is that of a nuisance. Location in a zone 0.25–1 km from the highway is not significant (for OLS; it is positive and becomes significant at p < 0.05 in the spatial models). The results for the neighborhood variables are also in accordance with conventional wisdom: travel time and crime are negatively valued, whereas % white, the proportion of college graduates and median income have a positive effect. Poverty and the school quality score were not found to be significant. The percentage elderly is positive, but this finding is not stable across estimators (see below). Los Angeles county was used as the base case, which resulted in a negative value for the dummy variables for Riverside and San Bernardino, but no significant difference for Orange county. The overall fit is very satisfactory, with an R² of 0.78. However, as the model diagnostics indicate (Table 4), OLS suffers from a number of problems. First, the Durbin–Wu–Hausman test statistic for endogeneity strongly rejects the null hypothesis that the interpolated pollutants are exogenous. In addition, there is evidence of very high residual spatial autocorrelation, with the robust LM test statistic suggesting the lag specification as the proper alternative.
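For readers who want to see how diagnostics of this type are computed from OLS residuals, the sketch below implements the standard (non-robust) LM statistics for lag and error dependence. It is an illustration only: the statistics in Table 4 are the robust forms, which involve additional correction terms not shown here, and the function name, weights and simulated data are assumptions.

```python
import numpy as np


def lm_tests(y, X, W):
    """Non-robust LM statistics for spatial lag and spatial error dependence."""
    n = y.shape[0]
    b = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS coefficients
    e = y - X @ b                                 # OLS residuals
    s2 = float(e @ e) / n                         # ML-type error variance
    T = float(np.trace(W.T @ W + W @ W))
    lm_err = (float(e @ (W @ e)) / s2) ** 2 / T
    WXb = W @ (X @ b)
    # Residual of W X b after projection on X, i.e. M (W X b).
    MWXb = WXb - X @ np.linalg.lstsq(X, WXb, rcond=None)[0]
    nJ = T + float(WXb @ MWXb) / s2
    lm_lag = (float(e @ (W @ y)) / s2) ** 2 / nJ
    return lm_lag, lm_err


# Simulated data with a spatial lag process (all values hypothetical).
rng = np.random.default_rng(2)
n = 300
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = 1.0
    A[i, (i + 1) % n] = 1.0
W = A / A.sum(axis=1, keepdims=True)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = np.linalg.solve(np.eye(n) - 0.5 * W,
                    X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n))

lm_lag, lm_err = lm_tests(y, X, W)
```

Each statistic is compared with a chi-squared distribution with one degree of freedom (5% critical value 3.84), so values in the thousands, as in Table 4, indicate overwhelming rejection of the null of no spatial dependence.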
We next consider the effect on the estimates for the traditional hedonic variables of treating the pollutants as endogenous (column IV in Table 3), including a spatially lagged dependent variable (column LAG), and combining both spatial lag and endogeneity of the pollutants (column LAG-end). Note that the A–K test for residual spatial autocorrelation also rejected the null for all three non-OLS cases, even after a spatially lagged dependent variable was included. The latter is highly significant, with estimates for the spatial autoregressive coefficient around 0.3. The A–K test points to the need to account for remaining spatial error autocorrelation through the HAC approach. The most appropriate specification is therefore the LAG-end with HAC

L. Anselin, N. Lozano-Gracia

variance estimates. The other results are provided to assess the effect of addressing endogeneity and spatial effects in isolation versus in combination. For the individual house characteristics and accessibility variables, the estimated coefficients remain fairly stable across methods, with only marginal changes. The estimates obtained with LAG-end are slightly smaller in absolute value, but the significance levels all remain the same. This is not the case for the estimates of the neighborhood characteristics, which vary considerably across methods, both in magnitude and in significance. For example, Poverty, which is not significant for OLS, IV and LAG, becomes significant and negative in the LAG-end model. In the reverse direction, the % elderly, which is significant in OLS, gradually loses significance (significant only at p < 0.05 for IV and LAG) and becomes insignificant in LAG-end. The absolute value of the coefficients for Income, College and Vcrime in LAG-end is less than half the magnitude for OLS. These variables are measured at an aggregate scale (census tract or block group, or city for the crime variable) and therefore the disturbances from the model may be correlated within the aggregation groups (Moulton 1990). It is likely that houses in the same census tract share unobservable characteristics, leading to correlation in the error terms. We surmise that the inclusion of a spatially lagged dependent variable filters out some of this error and yields more accurate estimates. The pollution variables are similarly affected by the estimation method. The coefficients of both Ozone and TSP are negative and highly significant throughout. However, their absolute value varies considerably across methods. Taken individually, the effect of controlling for endogeneity seems to be strongest, with the coefficient changing between OLS and IV from −0.025 to −0.014 for Ozone, and from −0.005 to −0.010 for TSP. Between OLS and LAG, the change is much smaller.
In LAG-end, accounting for both the spatial effects and the endogeneity yields a coefficient of −0.0099 for Ozone and −0.0073 for TSP. This suggests that a reduction of 1 ppb in OZ levels would raise house prices by 0.99% and a decrease of 1 µg/m³ in TSP values would increase house values by 0.73%. Since the joint consideration of spatial effects and endogeneity is new in the current paper, there are no results available in the literature with which to compare our findings directly. However, our OLS estimates are in line with previously published results. For example, in a meta-analysis of 37 studies, Smith and Huang (1995) suggest that a decrease of 1 µg/m³ in the TSP values will result in an increase of house values ranging between 0.05 and 0.10%. Using an IV estimator, Chay and Greenstone (2005) estimate that a change of 1 µg/m³ will produce a 0.2–0.4% change in house prices in the opposite direction. These estimates are considerably lower than those obtained in the current study, but it is important to keep in mind that their results are obtained for county aggregates. The OLS results in Beron et al. (2001) suggest that a decrease of one ppb of OZ would produce an increase in house prices ranging from 2.3 to 7.1%, which is consistent with our OLS estimates. Relative to OLS, when accounting for both endogeneity and spatial autocorrelation in the LAG-end model, the effect of ozone on house prices appears to be significantly smaller in absolute terms, while the effect of TSP is larger in absolute value. As shown in Table 4, the A–K test in the LAG-end model still shows significant remaining spatial error autocorrelation. We assess the effect of this on the precision of the estimates for both pollutants by computing three sets of standard errors: classical,

Errors in variables and spatial effects in hedonic house price models of ambient air quality

Table 5 Standard errors: OZ

                      Coeff.      Standard errors
Weights  Model        OZ          Classical  White    HAC-Ep   HAC-Tr   HAC-Bi
–        OLS          −0.0253     0.0008     0.0008   0.0018   0.0016   0.0016
–        IV           −0.0137     0.0010     0.0011   0.0026   0.0023   0.0024
Queen    LAG          −0.0195     0.0007     0.0008   0.0012   0.0011   0.0011
Queen    LAG-end      −0.0099     0.00099    0.0010   0.0017   0.0016   0.0016
Knn6     LAG          −0.01822    0.00078    0.00087  0.00125  0.00115  0.00115
Knn6     LAG-end      −0.00895    0.00099    0.00108  0.00175  0.00157  0.00158
Knn12    LAG          −0.01802    0.00078    0.00086  0.00123  0.00113  0.00114
Knn12    LAG-end      −0.00853    0.00098    0.00107  0.00170  0.00154  0.00155

Table 6 Standard errors: TSP

                      Coeff.      Standard errors
Weights  Model        TSP         Classical  White    HAC-Ep   HAC-Tr   HAC-Bi
–        OLS          −0.0047     0.00010    0.00010  0.00021  0.00019  0.00019
–        IV           −0.0102     0.00019    0.00019  0.00046  0.00041  0.00041
Queen    LAG          −0.0032     0.00010    0.00010  0.00010  0.00014  0.00014
Queen    LAG-end      −0.0073     0.00018    0.00020  0.00032  0.00029  0.00030
Knn6     LAG          −0.0032     0.00009    0.00010  0.00015  0.00014  0.00014
Knn6     LAG-end      −0.0073     0.00018    0.00020  0.00031  0.00028  0.00028
Knn12    LAG          −0.0032     0.00009    0.00010  0.00015  0.00013  0.00014
Knn12    LAG-end      −0.0073     0.00018    0.00020  0.00031  0.00028  0.00029

White (heteroskedasticity-consistent), and HAC. The results are reported in Tables 5 and 6, for the three spatial weights matrices and three kernel functions. The estimates for the pollution variables are essentially the same across the three spatial weights, with only a slight difference for ozone. However, accounting for remaining heteroskedasticity and spatial error correlation has a dramatic effect on the precision of the estimates. The standard errors are up to twice as large for HAC as for the classical and White results, with the largest value consistently obtained for the Epanechnikov kernel. By and large, the numerical values are essentially the same across kernels and spatial weights, which provides some evidence of the robustness of our findings. The more realistic measure of the standard errors of the estimates will be important in assessing the precision of the derived welfare measures, such as the MWTP, to which we turn next.
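To make the comparison between White and HAC standard errors concrete, the following is a minimal sketch of a kernel-based spatial HAC covariance estimate for an OLS fit. It is not the authors' implementation: the function names, the use of a single fixed bandwidth, the Euclidean distance on coordinates, and the OLS sandwich form are all assumptions made for illustration. The three kernels are the usual triangular, Epanechnikov and bisquare forms.

```python
import numpy as np


def kernel_weights(dist, bandwidth, kind="triangular"):
    """Kernel weights K(d / bandwidth); zero at or beyond the bandwidth."""
    z = dist / bandwidth
    if kind == "triangular":
        k = 1.0 - z
    elif kind == "epanechnikov":
        k = 1.0 - z ** 2
    elif kind == "bisquare":
        k = (1.0 - z ** 2) ** 2
    else:
        raise ValueError(f"unknown kernel: {kind}")
    return np.where(z < 1.0, k, 0.0)


def hac_cov_ols(X, resid, coords, bandwidth, kind="triangular"):
    """Sandwich covariance (X'X)^-1 [sum_ij K_ij x_i x_j' e_i e_j] (X'X)^-1.

    Assumption: one fixed bandwidth for all observations (hypothetical choice).
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # pairwise distances
    K = kernel_weights(dist, bandwidth, kind)    # n x n kernel matrix
    Xe = X * resid[:, None]                      # rows are x_i * e_i
    meat = Xe.T @ K @ Xe
    bread = np.linalg.inv(X.T @ X)
    return bread @ meat @ bread
```

With a bandwidth so small that no two distinct observations fall within it, the kernel matrix reduces to the identity and the estimate collapses to the White form, which is one way to see why the HAC standard errors can only differ from the White ones through cross-products of nearby residuals.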

5 Policy analysis

We conclude this empirical exercise by comparing the valuation of air quality computed from the parameter estimates obtained by the alternative methods. In a hedonic

model, the implicit price of any characteristic may be obtained as the derivative of the hedonic price equilibrium equation with respect to the characteristic of interest. In a non-spatial log-linear model, the MWTP equals the estimated coefficient for the pollution variable times the price (P), or (see footnote 19):

\[ \mathrm{MWTP}_g = \frac{\partial P}{\partial g} = \hat{\beta}_g P, \qquad (16) \]

where g is either OZ or TSP. As shown in Kim et al. (2003), a spatial multiplier effect needs to be accounted for to accurately compute the MWTP in a spatial lag model. For a uniform change in the amenity across all observations the MWTP then follows as:

\[ \mathrm{MWTP} = \hat{\beta}_g P \left( \frac{1}{1-\rho} \right), \qquad (17) \]

with ρ as the estimate of the spatial autoregressive coefficient. The distinction between (16) and (17) is important in light of the recent discussion by Small and Steimetz (2006). They considered the different interpretation of welfare effects between the direct valuation in (16) and the multiplier effect included in (17). In their view, the multiplier effect should only be considered as part of the welfare calculation in the case of a technological externality associated with a change in amenities. In the case of a purely pecuniary externality, the direct effect is the only correct measure of welfare change. A strong argument in favor of using a spatial lag specification (where warranted by the data) is that it allows the two effects to be considered explicitly. In Tables 7 and 8 we report the calculated MWTP for OZ and TSP for the four estimation methods. For the lag models, we include both the direct effect and the total effect. In addition to point estimates, we list a confidence band which consists of ± two standard errors around the point estimate. In the non-spatial models and for the direct effect computation, the standard errors are those reported for the regression coefficients. For the spatial multiplier, the standard errors of both β̂ and ρ̂ need to be accounted for jointly, which we implement by means of the delta method (see, e.g., Greene 2003 for further details). We report the results for the three spatial weights and with standard errors based on the classical form, the White form and the three HAC formulations. The MWTP are estimated for a change of 0.1 ppb for ozone and 1 µg/m³ for particles, which correspond to changes of 1.1% on average. For both pollutants, we note a striking difference between the OLS estimate and the result from the LAG-end model, but not in the same direction.
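The calculations in (16) and (17), together with a delta-method standard error for the total effect, can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the sign convention (value of a reduction, so the negative coefficient yields a positive MWTP) is an assumption, and the mean house price and the covariance between the two estimates are placeholders because they are not reported in this section.

```python
import math


def mwtp_direct(beta, price, delta=1.0):
    """Value of a reduction of size `delta` in the pollutant, as in (16)."""
    return -beta * price * delta


def mwtp_total(beta, rho, price, delta=1.0):
    """Direct effect scaled by the spatial multiplier 1 / (1 - rho), as in (17)."""
    return mwtp_direct(beta, price, delta) / (1.0 - rho)


def mwtp_total_se(beta, rho, var_beta, var_rho, cov_beta_rho, price, delta=1.0):
    """Delta-method standard error of the total effect g = -beta*P*delta/(1-rho)."""
    scale = price * delta
    d_beta = -scale / (1.0 - rho)               # dg/dbeta
    d_rho = -beta * scale / (1.0 - rho) ** 2    # dg/drho
    var = (d_beta ** 2 * var_beta
           + d_rho ** 2 * var_rho
           + 2.0 * d_beta * d_rho * cov_beta_rho)
    return math.sqrt(var)


def band(estimate, se):
    """Confidence band of plus or minus two standard errors."""
    return estimate - 2.0 * se, estimate + 2.0 * se


# Illustrative inputs: coefficient and HAC-Ep standard error for OZ (LAG-end,
# queen weights); price, rho standard error and covariance are hypothetical.
price = 244_000.0
total = mwtp_total(-0.0099, 0.33, price, delta=0.1)
se = mwtp_total_se(-0.0099, 0.33, 0.0017 ** 2, 0.0121 ** 2, 0.0, price, delta=0.1)
low, high = band(total, se)
```

Setting the covariance term to zero is a simplification; with the full estimated covariance matrix of the coefficients, the same gradient expression applies unchanged.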
For ozone, the OLS result would suggest a point estimate of $616 compared to $330–$358 as the range across spatial weights for the total effect for LAG-end, with $208–$241 as the range for the direct effect. For TSP, the direction of change is opposite, with an OLS result

19 In all cases we use the mean house price in the sample to calculate the MWTP.

Table 7 MWTP for reductions in OZ levels

OZ          OLS      IV       LAG        LAG               LAG-end   LAG-end
                              Direct     With multiplier   Direct    With multiplier
Queen
  Estimate  616      335      475        710               241       358
  Classic   575–657  281–388  433–513    652–768           193–289   286–430
  White     573–659  277–392  433–517    649–770           189–293   281–435
  HAC-Ep    526–705  204–466  412–538    618–802           154–329   229–489
  HAC-Tr    536–695  220–450  417–532    626–794           163–320   242–475
  HAC-Bi    535–696  218–452  417–532    625–794           162–320   241–475
KNN-6
  Estimate  –        –        444        684               218       334
  Classic   –        –        405–4,818  625–743           170–266   260–408
  White     –        –        401–486    623–770           165–271   257–435
  HAC-Ep    –        –        382–505    592–776           133–303   204–465
  HAC-Tr    –        –        387–500    600–768           141–295   218–451
  HAC-Bi    –        –        387–500    600–767           141–295   217–452
KNN-12
  Estimate  –        –        439        706               208       330
  Classic   –        –        400–4768   644–768           160–255   254–406
  White     –        –        397–481    649–770           155–260   281–435
  HAC-Ep    –        –        379–499    592–776           125–291   204–465
  HAC-Tr    –        –        383–494    620–793           132–283   211–450
  HAC-Bi    –        –        383–494    619–793           132–283   210–450

of $1,148 contrasted with a range of $2,640–$2,713 for the total effect using LAG-end, and $1,705–$1,778 as the range of the direct effect. Taking into account the standard errors, including the much wider ones suggested by the HAC estimates, we can characterize these differences as significant. Even though it is sometimes suggested that OLS results may be appropriate as estimates of the total effect, our findings do not support this (see footnote 20). It is also interesting to note that the direction of the difference is opposite between the two pollutants, something earlier studies that included only a single pollutant were not able to ascertain. The main conclusion is therefore that OLS estimates are likely to be misleading, but not that they over- or underestimate the total effect in a specific direction. A closer consideration of the results seems to suggest that the primary difference is due to accounting for endogeneity, rather than the inclusion of the spatial lag.

20 This is separate from the issue that on technical grounds OLS will yield inconsistent and imprecise estimates, due to the presence of both spatial autocorrelation and endogeneity.

Table 8 MWTP for reductions in TSP levels

TSP         OLS          IV           LAG       LAG               LAG-end      LAG-end
                                      Direct    With multiplier   Direct       With multiplier
Queen
  Estimate  1148         2489         783       1,170             1,778        2,640
  Classic   1,097–1,198  2,392–2,585  733–832   1,096–1,245       1,687–1,868  2,511–2,769
  White     1,099–1,197  2,394–2,585  732–833   1,099–1,242       1,678–1,877  2,511–2,769
  HAC-Ep    1,042–1,253  2,261–2,716  733–832   1,105–1,236       1,620–1,935  2,419–2,861
  HAC-Tr    1,054–1,241  2,289–2,689  712–853   1,071–1,270       1,634–1,922  2,439–2,841
  HAC-Bi    1,053–1,243  2,285–2,693  712–853   1,070–1,271       1,632–1,924  2,437–2,844
KNN-6
  Estimate  –            –            775       1,195             1,740        2,671
  Classic   –            –            726–823   1,119–1,245       1,651–1,830  2,538–2,773
  White     –            –            724–825   1,124–1,241       1,642–1,838  2,542–2,769
  HAC-Ep    –            –            700–849   1,086–1,279       1,587–1,893  2,441–2,870
  HAC-Tr    –            –            706–843   1,096–1,269       1,601–1,880  2,430–2,850
  HAC-Bi    –            –            706–843   1,095–1,270       1,599–1,881  2,459–2,852
KNN-12
  Estimate  –            –            743       1,196             1,705        2,713
  Classic   –            –            694–791   1,117–1,249       1,616–1,794  2,576–2,777
  White     –            –            693–793   1,125–1,241       1,607–1,803  2,586–2,767
  HAC-Ep    –            –            670–816   1,084–1,282       1,553–1,857  2,486–2,867
  HAC-Tr    –            –            675–811   1,093–1,273       1,564–1,845  2,505–2,848
  HAC-Bi    –            –            675–811   1,093–1,274       1,563–1,847  2,503–2,850

The main contribution of including the latter is that it becomes possible to distinguish the direct effect from the total effect. However, taking into account the standard errors (especially the HAC estimates), there does not seem to be a significant difference between the total effect in LAG-end and the estimate obtained for IV. The latter is significantly different from the direct effect estimate under LAG-end, so that it would not be appropriate to use the IV results as a welfare measure when pecuniary externalities are underlying the spatial multiplier. A similar comparison holds between the results of OLS and those of the LAG model without endogeneity. It should be noted that the inclusion of the spatial lag has important consequences for the other parameters in the model (such as the neighborhood characteristics) and that we are not suggesting that it should be ignored. From a policy perspective, if the sole concern is with an estimate of MWTP irrespective of its composition between direct effects and spatial multiplier effects, the results from a model that accounts for endogeneity (but ignores the spatial lag) may be acceptable. Even then, ignoring spatial effects leads to unrealistic indications of precision (narrow confidence intervals), which may be misleading in a decision support setting.

6 Conclusion

In this paper, we contribute to the empirical literature on the valuation of ambient air quality in spatial hedonic models by considering three novel aspects. First, we considered endogeneity in the form of errors in variables for the interpolated measures of air pollution. This led to the use of spatial two stage least squares estimation with instruments for the spatially lagged dependent variable as well as the inclusion of the coordinates of house locations and their interaction as instruments for the interpolated pollution values. Second, we implemented the recently developed heteroskedasticity and spatial autocorrelation consistent (HAC) estimates for the standard errors to obtain more robust results for the precision of the computed MWTP. Third, we extended the scope of the analysis by including two pollutants in the specification, rather than the traditional focus on a single pollutant. Additionally, we carried this out for one of the largest samples used in the empirical study of spatial hedonic models. Our results underscore the importance of correcting for the errors in variables nature of the interpolated pollution values. The effect is significant both for the coefficient estimates in the hedonic model and for the calculation of the MWTP. For the coefficient estimates, the main changes are seen for the pollution variables and the neighborhood measures. The coefficients for the individual house characteristics were found to be only marginally affected by the estimation method. In all cases, strong evidence was found of spatial error autocorrelation, which persisted even after a spatially lagged dependent variable was included in the model. This provides a solid argument in favor of using HAC estimates of the standard errors. In practice, classical and even White standard errors seriously underestimate the imprecision of the estimates in the presence of remaining spatial correlation and spatial heterogeneity.
Interestingly, there is no consistent direction of the bias of the OLS estimates for the pollution variables. Further insight into the precise mechanisms underlying this phenomenon requires additional investigation. The computation of the MWTP is similarly affected by the choice of the estimation method. The need to account for endogeneity is clear and OLS-based calculations are likely to be misleading. Moreover, a spatial lag specification allows for a distinction between direct effects and the role of a spatial multiplier, which are combined in the estimates of the non-spatial models. A number of aspects of estimation were not taken into consideration and remain the subject of future work. Foremost among these is the role of spatial heterogeneity. The strong evidence of remaining heterogeneity and spatial correlation would suggest that perhaps a different scale of analysis might be more appropriate. For example, this might include an explicit accounting for submarkets or for possible sorting of households by preference regarding environmental quality. Finally, the evidence presented here only applies to a single case study, and additional empirical work is needed to start establishing the foundations for general results. It is hoped that accounting for errors in variables of the interpolated pollution measures will become a routine aspect of applied work in spatial hedonic models of ambient air quality.
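The estimation strategy summarized in the conclusion (a spatial lag model with endogenous pollution measures, estimated by spatial two stage least squares) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, column 0 of X is assumed to be the constant, and the use of the spatially lagged exogenous variables WX as instruments for the spatial lag is an assumption (a common choice), while the coordinates and their interaction follow the description in the text.

```python
import numpy as np


def two_stage_least_squares(y, Z, H):
    """2SLS estimate (Z' P_H Z)^-1 Z' P_H y, with P_H the projection on H."""
    Z_hat = H @ np.linalg.solve(H.T @ H, H.T @ Z)   # first-stage fitted values
    return np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)


def spatial_2sls(y, X, W, pollutants, coords):
    """Spatial lag model with endogenous pollutants.

    Returns coefficients ordered as [X columns, pollutant columns, rho].
    Assumptions: X[:, 0] is the constant; instruments are X, W X (without
    the constant), the two coordinates and their interaction.
    """
    cx, cy = coords[:, 0], coords[:, 1]
    H = np.column_stack([X, W @ X[:, 1:], cx, cy, cx * cy])   # instruments
    Z = np.column_stack([X, pollutants, W @ y])               # regressors
    return two_stage_least_squares(y, Z, H)
```

The point estimates from such a routine would then be combined with classical, White or HAC covariance estimates; the choice of instrument set is the main design decision and directly affects how well the lag coefficient and the pollution coefficients are identified.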

Appendix Table 9 Estimates from alternative models and standard errors Variable

OLS

IV

Queen weights

knn6 weights

knn12 weights

LAG

LAG-end

LAG

LAG

7.8676

LAG-end

LAG-end

Constant

12.12

12.5281

8.1169

8.4700

8.1911

7.5316

7.900

Classical

(0.0188)

(0.0219)

(0.0730)

(0.07645) (0.0715)

(0.0746)

(0.0727)

(0.0761)

White

(0.02118) (0.0246)

(0.1299)

(0.1336)

(0.1388)

(0.1414)

(0.1344)

(0.1379 )

HAC-Tr

(0.0373)

(0.0457)

(0.1471)

(0.1525)

(0.1607)

(0.1646)

(0.1593)

(0.1649)

HAC-Ep

(0.0418)

(0.0513)

(0.1521)

(0.0764)

(0.1668)

(0.1711)

(0.1653)

(0.1715)

HAC-Bi

(0.0380 )

(0.0465)

(0.1480)

(0.1535)

(0.1622)

(0.1662)

(0.1608)

(0.1666)

0.0012

0.0009

0.0009

0.0009

0.0009

0.0009

0.0009

Landarea 0.0011 Classical

(0.00004) (0.00004) (0.00004) (0.00004) (0.00004) (0.00004) (0.00004) (0.00004)

White

(0.00028) (0.0003)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

HAC-Tr

(0.00028) (0.0003)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

HAC-Ep

( 0.0002 ) (0.0003)

(0.0002)

(0.00004) (0.0002)

(0.0002)

(0.0002)

(0.0002)

HAC-Bi

(0.0002)

(0.0003)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

(0.0002)

Livarea

2.6057

2.5864

2.2326

2.2237

2.1689

2.1587

2.2004

2.1942

Classical

(0.0207)

(0.02108) (0.0204)

(0.0206)

(0.0204)

(0.0206)

(0.0201)

(0.0203)

White

(0.1235)

(0.1228)

(0.1104)

(0.1098)

(0.1089)

(0.1082)

(0.1084)

(0.1080)

HAC-Tr

(0.1259)

(0.1252)

(0.11141) (0.1108)

(0.1103)

(0.1096)

(0.1094)

(0.1090)

HAC-Ep

(0.1269)

(0.1262)

(0.1120)

(0.0206)

(0.1110)

(0.1103)

(0.1100)

(0.1096)

HAC-Bi

( 0.1259)

(0.1253)

(0.1112)

(0.1107)

(0.1102)

(0.1093)

(0.1093)

(0.1089)

Elevation −0.0004

−0.0029

0.0027

0.0007

0.00007

−0.0018

0.0002

−0.0017

Classical

(0.0057)

(0.0058)

(0.0053)

(0.0054)

(0.0053)

(0.0053)

(0.0053)

(0.0053)

White

(0.0066)

(0.0067)

(0.0062)

(0.0062)

(0.0061)

(0.0062)

(0.0061)

(0.0062)

HAC-Tr

(0.0074)

(0.0077)

(0.0068)

(0.0071)

(0.0066)

(0.0068)

(0.0068)

(0.0071)

HAC-Ep

(0.0075)

(0.0079)

(0.0069)

(0.0073)

(0.0066)

(0.0069)

(0.0069)

(0.0072)

HAC-Bi

(0.0074)

(0.0078)

(0.0069)

(0.0071)

(0.0066)

(0.0068)

(0.0069)

(0.0071)

Baths

0.0471

0.04724

0.0415

0.0416

0.0416

0.0417

0.0409

0.0411

Classical

(0.0019)

(0.0019)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

White

(0.0075)

(0.0074)

(0.0061)

(0.0059)

(0.0058)

(0.0059)

(0.0061)

(0.0059)

HAC-Tr

(0.00769) (0.0076)

(0.0062)

(0.0062)

(0.0060)

(0.0060)

(0.0061)

(0.0061)

HAC-Ep

(0.0077)

(0.0472)

(0.0063)

(0.0063)

(0.0060 )

(0.0060)

(0.0061)

(0.0061)

HAC-Bi

(0.0077)

(0.0076)

(0.0062)

(0.0062)

(0.0060)

(0.0060)

(0.0061)

(0.0061)

(0.0002)

Fireplace 0.0457

0.0441

0.0363

0.0352

0.0348

0.0337

0.0352

0.0342

Classical

(0.0017)

(0.0017)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

White

(0.0023)

(0.0023)

(0.0020)

(0.0020)

(0.0019)

(0.0019)

(0.0019)

(0.0019)

HAC-Tr

(0.00277) (0.0028)

(0.0023)

(0.0023)

(0.0022)

(0.0022)

(0.0022)

(0.0023)

HAC-Ep

(0.0028)

(0.0029)

(0.0023)

(0.0024)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

HAC-Bi

(0.0028)

(0.0028)

(0.0023)

(0.0023)

(0.0022)

(0.0022)

(0.0023)

(0.0023)

Pool

0.0505

0.0508

0.0438

0.0440

0.0435

0.0437

0.0434

0.0436

Table 9 continued Variable

OLS

IV

Queen weights

knn6 weights

knn12 weights

LAG

LAG

LAG

LAG-end

LAG-end

LAG-end

Classical

(0.0025)

(0.0025)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

White

(0.0031)

(0.0031)

(0.0027)

(0.0027)

(0.0026)

(0.0026)

(0.0026)

(0.0026)

HAC-Tr

(0.0034)

(0.0034)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

HAC-Ep

(0.0035)

(0.0035)

(0.0029)

(0.0029)

(0.0028)

(0.0028)

(0.0029)

(0.0029)

HAC-Bi

(0.0034)

(0.0034)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

Age

−0.0166

−0.0197

−0.0130

−0.0153

−0.0102

−0.0125

−0.0104

−0.0127

Classical

(0.0017)

(0.0018)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

White

(0.0026)

(0.0026)

(0.0023)

(0.0023)

(0.0022)

(0.0022)

(0.0022)

(0.0022)

HAC-Tr

(0.0037)

(0.0038)

(0.0028)

(0.0029)

(0.0027)

(0.0028)

(0.0027)

(0.0028)

HAC-Ep

(0.0039)

(0.0041)

(0.0030)

(0.0031)

(0.0028)

(0.0029)

(0.0028)

(0.0029) (0.0028)

HAC-Bi

(0.0037)

(0.0038)

(0.0029)

(0.0029)

(0.0027)

(0.0028)

(0.0027)

Age Sqrd.

0.0190

0.0153

0.0140

0.0113

0.0113

0.0087

0.0110

0.0084

Classical

(0.0017)

(0.0018)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

(0.0016)

White

(0.0027)

(0.0027)

(0.0024)

(0.0024)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

HAC-Tr

(0.0038)

(0.0039)

(0.0029)

(0.0030)

(0.0028)

(0.0028)

(0.0028)

(0.0029)

HAC-Ep

(0.0041)

(0.0042)

(0.0031)

(0.0032)

(0.0029)

(0.0030)

(0.0029)

(0.0030)

HAC-Bi

(0.0039)

(0.0040)

(0.0029)

(0.0030)

(0.0028)

(0.0029)

(0.0028)

(0.0028)

Beach

0.2405

0.2661

0.1719

0.1934

0.1721

0.1925

0.01688

0.1903

Classical

(0.0079)

(0.0081)

(0.0075)

(0.0076)

(0.0074)

(0.0076)

(0.0074)

(0.0076)

White

(0.0119)

(0.0119)

(0.0114)

(0.0115)

(0.0112)

(0.0113)

(0.0112)

(0.0113)

HAC-Tr

(0.0247)

(0.0247)

(0.0182)

(0.0184)

(0.0174)

(0.0175)

(0.0169)

(0.0171)

HAC-Ep

(0.0275)

(0.0275)

(0.0199)

(0.0200)

(0.0190)

(0.0191)

(0.0183)

(0.0185)

HAC-Bi

(0.0254)

(0.0254)

(0.0186)

(0.0188)

(0.0177)

(0.0179)

(0.0172)

(0.0175)

AC

−0.0249

−0.0229

−0.0159

−0.0150

−0.0148

−0.0138

−0.0144

−0.0136

Classical

(0.0021)

(0.0022)

(0.0020)

(0.0020)

(0.0020)

(0.0020)

(0.0020)

(0.0020)

White

(0.0021)

(0.0021)

(0.0020)

(0.0020)

(0.0019)

(0.0019)

(0.0019)

(0.0019)

HAC-Tr

(0.0029)

(0.0029)

(0.0024)

(0.0024)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

HAC-Ep

(0.0032)

(0.0031)

(0.0025)

(0.0025)

(0.0024)

(0.0024)

(0.0024)

(0.0025)

HAC-Bi

(0.0030)

(0.0030)

(0.0024)

(0.0024)

(0.0023)

(0.0024)

(0.0024)

(0.0024)

Heat

0.0386

0.0363

0.0245

0.0229

0.0235

0.0219

0.0234

0.0219

Classical

(0.0024)

(0.0024)

(0.0022)

(0.0022)

(0.0022)

(0.0022)

(0.0022)

(0.0022)

White

(0.0025)

(0.0025)

(0.0024)

(0.0024)

(0.0024)

(0.0024)

(0.0024)

(0.0024)

HAC-Tr

(0.0033)

(0.0033)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

HAC-Ep

(0.0035)

(0.0035)

(0.0029)

(0.0030)

(0.0029)

(0.0029)

(0.0029)

(0.0029)

HAC-Bi

(0.0386)

(0.0030)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

(0.0028)

Travel time

−0.0649

−0.0579

−0.0541

−0.0494

−0.0519

−0.0472

−0.0516

−0.0472

Classical

(0.00243)

(0.0024)

(0.0022)

(0.0023)

(0.0022)

(0.0022)

(0.0022)

(0.0022)

White

(0.0025)

(0.0025)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

(0.0023)

HAC-Tr

(0.0049)

(0.0049)

(0.0035)

(0.0035)

(0.0033)

(0.0034)

(0.0033)

(0.0034)

HAC-Ep

(0.0056)

(0.0056)

(0.0038)

(0.0039)

(0.0037)

(0.0037)

(0.0036)

(0.0037)

Table 9 continued Variable

OLS

IV

Queen weights

knn6 weights

knn12 weights

LAG

LAG-end

LAG

LAG-end

LAG

LAG-end (0.0034)

HAC-Bi

(0.0050)

(0.0050)

(0.0035)

(0.0036)

(0.0034)

(0.0034)

(0.0033)

Poverty

0.0142

−0.0210

0.0201

−0.0454

−0.0137

−0.0386

−0.0137

−0.0379

Classical

(0.0152)

(0.0154)

(0.0142)

(0.0143)

(0.0141)

(0.0142)

(0.0140)

(0.0142)

White

(0.0178)

(0.0181)

(0.0164)

(0.0166)

(0.0163)

(0.0165)

(0.0163)

(0.0165)

HAC-Tr

(0.0302)

(0.0318)

(0.0216)

(0.0230)

(0.0206)

(0.0220)

(0.0205)

(0.0219)

HAC-Ep

(0.0336)

(0.0356)

(0.0233)

(0.0251)

(0.0222)

(0.0239)

(0.0218)

(0.0236)

HAC-Bi

(0.0307)

(0.0324)

(0.0218)

(0.0232)

(0.0207)

(0.0221)

(0.0206)

(0.0220)

White

0.3230

0.3179

0.2282

0.2253

0.2241

0.2208

0.2173

0.2151

Classical

(0.0059)

(0.0060)

(0.0057)

(0.0058)

(0.0057)

(0.0057)

(0.0057)

(0.0057)

White

(0.0059)

(0.0060)

(0.0063)

(0.0063)

(0.0063)

(0.0063)

(0.0062)

(0.0062)

HAC-Tr

(0.0117)

(0.0120)

(0.0088)

(0.0089)

(0.0086)

(0.0087)

(0.0084)

(0.0086)

HAC-Ep

(0.0133)

(0.0136)

(0.0096)

(0.0098)

(0.0094)

(0.0096)

(0.0092)

(0.0094)

HAC-Bi

(0.0119)

(0.0122)

(0.0088)

(0.0090)

(0.0086)

(0.0088)

(0.0085)

(0.0087)

Over65

0.1125

0.0376

0.0327

−0.0213

0.0231

−0.0297

0.0239

−0.0276

Classical

(0.0173)

(0.0177)

(0.0162)

(0.0164)

(0.0161)

(0.0163)

(0.0057)

(0.0163)

White

(0.0190)

(0.0189)

(0.0165)

(0.0168)

(0.0162)

(0.0164)

(0.0163)

(0.0165)

HAC-Tr

(0.0340)

(0.0337)

(0.0240)

(0.0246)

(0.0226)

(0.0235)

(0.0224)

(0.0235)

HAC-Ep

(0.0378)

(0.0374)

(0.0260)

(0.0267)

(0.0244)

(0.0256)

(0.0241)

(0.0254)

HAC-Bi

(0.0349)

(0.0346)

(0.0245)

(0.0250)

(0.0229)

(0.0238)

(0.0228)

(0.0238)

College

1.0155

0.9192

0.5988

0.5342

0.5807

0.5156

0.5341

0.4751

Classical

(0.0093)

(0.0098)

(0.0113)

(0.0114)

(0.0111)

(0.0111)

(0.0113)

(0.0113)

White

(0.0110)

(0.0110)

(0.0147)

(0.0141)

(0.0150)

(0.0144)

(0.0149)

(0.0144)

HAC-Tr

(0.0216)

(0.0216)

(0.0204)

(0.0196)

(0.0204)

(0.0196)

(0.0207)

(0.0199)

HAC-Ep

(0.0242)

(0.0244)

(0.0220)

(0.0212)

(0.0219)

(0.0211)

(0.0221)

(0.0213)

HAC-Bi

(0.0221)

(0.0221)

(0.0207)

(0.0199)

(0.0207)

(0.0199)

(0.0210)

(0.0201)

Income

0.0212

0.0224

0.0085

0.0096

0.0083

0.0093

0.0073

0.0085

Classical

(0.0005)

(0.0005)

(0.0005)

(0.0005)

(0.0005)

(0.0005)

(0.0005)

(0.0005)

White

(0.0009)

(0.0009)

(0.0007)

(0.0007)

(0.0007)

(0.0007)

(0.0007)

(0.0007)

HAC-Tr

(0.0014)

(0.0014)

(0.0009)

(0.0009)

(0.0009)

(0.0009)

(0.0009)

(0.0009)

HAC-Ep

(0.0015)

(0.0015)

(0.0010)

(0.0010)

(0.0010)

(0.0010)

(0.0010)

(0.0010)

HAC-Bi

(0.0014)

(0.0014)

(0.0010)

(0.0009)

(0.0009)

(0.0009)

(0.0009)

(0.0009)

Vcrime

−0.3938

−0.2450

−0.2446

−0.1261

−0.2283

−0.1130

−0.2098

−0.0940

Classical

(0.0236)

(0.0251)

(0.0222)

(0.0233)

(0.0220)

(0.0231)

(0.0231)

(0.0231)

White

(0.0239)

(0.0255)

(0.0234)

(0.0242)

(0.0234)

(0.0243)

(0.0234)

(0.0242)

HAC-Tr

(0.0451)

(0.0496)

(0.0325)

(0.0356)

(0.0311)

(0.0342)

(0.0308)

(0.0340)

HAC-Ep

(0.0510)

(0.0562)

(0.0355)

(0.0394)

(0.0340)

(0.0378)

(0.0332)

(0.0372)

HAC-Bi

(0.0459)

(0.0504)

(0.0327)

(0.0359)

(0.0312)

(0.0344)

(0.0309)

(0.0341)

API

0.0007

0.0011

0.0007

0.0017

0.0011

0.0013

0.0016

0.0018

Classical

(0.0013)

(0.0013)

(0.0012)

(0.0012)

(0.0012)

(0.0012)

(0.0012)

(0.0012)

White

(0.0014)

(0.0014)

(0.0012)

(0.0013)

(0.0063)

(0.0012)

(0.0012)

(0.0012)

Table 9 continued Variable

OLS

IV

Queen weights

knn6 weights

LAG

LAG

LAG-end

knn12 weights

LAG-end LAG

LAG-end

HAC-Tr

(0.0028)

(0.0029)

(0.0020)

(0.0020)

(0.0019)

(0.0019)

(0.0018)

(0.0019)

HAC-Ep

(0.0032)

(0.0033)

(0.0022)

(0.0019)

(0.0021)

(0.0022)

(0.0020)

(0.0021)

HAC-Bi

(0.0029)

(0.0029)

(0.0020)

(0.0021)

(0.0019)

(0.0020)

(0.0019)

(0.0019)

Distance Parks −0.0287 −0.0395 −0.0213 −0.0298

−0.0210 −0.0292

−0.0205 −0.0289

classical

(0.0011)

(0.0012)

(0.0010)

(0.0011)

(0.0010)

(0.0011)

(0.0012)

(0.0011)

White

(0.0011)

(0.0012)

(0.0010)

(0.0011)

(0.0010)

(0.0011)

(0.0010)

(0.0011)

HAC-Tr

(0.0021)

(0.0025)

(0.0015)

(0.0017)

(0.0014)

(0.0017)

(0.0014)

(0.0017)

HAC-Ep

(0.0024)

(0.0028)

(0.0016)

(0.0019)

(0.0015)

(0.0018)

(0.0015)

(0.0018)

HAC-Bi

(0.0021)

(0.0025)

(0.0015)

(0.0018)

(0.0014)

(0.0017)

(0.0014)

(0.0017)

SB

−0.1411 0.0652

−0.0938 −0.0413

−0.0897 −0.0384

0.0870

−0.0376

Classical

(0.0040)

(0.0044)

(0.0038)

(0.0041)

(0.0038)

(0.0041)

(0.0038)

(0.0041)

White

(0.0037)

(0.0040)

(0.0036)

(0.0037)

(0.0063)

(0.0036)

(0.0036)

(0.0036)

HAC-Tr

(0.0068)

(0.0084)

(0.0050)

(0.0060)

(0.0049)

(0.0058)

(0.0048)

(0.0058)

HAC-Ep

(0.0077)

(0.0096)

(0.0055)

(0.0068)

(0.0053)

(0.0065)

(0.0052)

(0.0065)

HAC-Bi

(0.0069)

(0.0085)

(0.0051)

(0.0061)

(0.0049)

(0.0058)

(0.0049)

(0.0058)

RI

−0.1405 −0.0025 −0.0977 0.0006

−0.0938 0.0021

−0.0915 0.0021

Classical

(0.0054)

(0.0065)

(0.0050)

(0.0059)

(0.0050)

(0.0059)

(0.0050)

(0.0058)

White

(0.0054)

(0.0062)

(0.0052)

(0.0057)

(0.0052)

(0.0057)

(0.0052)

(0.0057)

HAC-Tr

(0.0102)

(0.0125)

(0.0073)

(0.0085)

(0.0070)

(0.0082)

(0.0069)

(0.0081)

HAC-Ep

(0.0114)

(0.0142)

(0.0079)

(0.0094)

(0.0076)

(0.0091)

(0.0074)

(0.0089)

HAC-Bi

(0.0103)

(0.0127)

(0.0073)

(0.0086)

(0.0070)

(0.0083)

(0.0069)

(0.0081)

OR

−0.0077 0.0579

−0.0126 0.0370

−0.0084 0.0397

−0.0097 0.0384

Classical

(0.0032)

(0.0040)

(0.0030)

(0.0037)

(0.0030)

(0.0030)

(0.0036)

(0.0036)

white

(0.0032)

(0.0039)

(0.0030)

(0.0037)

(0.0029)

(0.0036)

(0.0029)

(0.0036)

HAC-Tr

(0.0066)

(0.0085)

(0.0047)

(0.0060)

(0.0044)

(0.0057)

(0.0044)

(0.0057)

HAC-Ep

(0.0075)

(0.0098)

(0.0052)

(0.0067)

(0.0049)

(0.0064)

(0.0048)

(0.0063)

HAC-Bi

(0.0067)

(0.0087)

(0.0047)

(0.0061)

(0.0045)

(0.0058)

(0.0044)

(0.057)

Highway1

−0.0199 −0.0234 −0.0130 −0.0155

−0.0116 −0.0121

−0.0129 −0.0154

Classical

(0.0030)

(0.0030)

(0.0028)

(0.00283) (0.0028)

(0.0028)

(0.0028)

(0.0028)

White

(0.0030)

(0.0032)

(0.0029)

(0.0030)

(0.0030)

(0.0029)

(0.0030)

(0.0029)

HAC-Tr

(0.0051)

(0.0054)

(0.0038)

(0.0040)

(0.0035)

(0.0038)

(0.0036)

(0.0039)

HAC-Ep

(0.0056)

(0.0060)

(0.0040)

(0.0043)

(0.0037)

(0.0041)

(0.0038)

(0.0041)

HAC-Bi

(0.0052)

(0.0055)

(0.0038)

(0.0041)

(0.0036)

(0.0038)

(0.0036)

(0.0039)

Highway2

0.0028

0.0027

0.0043

0.0043

0.0043

0.0040

0.0046

0.0047

Classical

(0.0018)

(0.0018)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

White

(0.0018)

(0.0018)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

(0.0017)

HAC-Tr

(0.0034)

(0.0036)

(0.0024)

(0.0025)

(0.0023)

(0.0024)

(0.0022)

(0.0024)

HAC-Ep

(0.0038)

(0.0040)

(0.0026)

(0.0028)

(0.0025)

(0.0027)

(0.0024)

(0.0026)

HAC-Bi

(0.0034)

(0.0036)

(0.0024)

(0.0026)

(0.0036)

(0.0024)

(0.0023)

(0.0024)

OZ

−0.0253 −0.0137 −0.0195 −0.0099

−0.0182 −0.0089

−0.0180 −0.0085

Table 9 continued Variable

OLS

IV

Queen weights LAG

knn6 weights

LAG-end LAG

knn12 weights

LAG-end LAG

LAG-end (0.0009)

Classical

(0.0008) (0.0010)

(0.0007) (0.0009)

(0.0007)

(0.0009)

(0.0007)

White

(0.0008) (0.0011)

(0.0008) (0.0010)

(0.0008)

(0.0010)

(0.00008) (0.0010)

HAC-Tr

(0.0016) (0.0023)

(0.0011) (0.0016)

(0.0011)

(0.0015)

(0.0011)

(0.0015)

HAC-Ep

(0.0018) (0.0026)

(0.0012) (0.0017)

(0.0012)

(0.0017)

(0.0012)

(0.0017)

HAC-Bi

(0.0016) (0.0024)

(0.0011) (0.0016)

(0.0011)

(0.0015)

(0.0011)

(0.0015)

TSP

−0.0047 −0.0102

−0.0032 −0.0073

−0.0031

−0.0071

−0.0030

−0.0070

Classical

(0.0001) (0.0001)

(0.0001) (0.0001)

(0.00009) (0.0001)

(0.00009) (0.0001)

White

(0.0001) (0.0001)

(0.0001) (0.0002)

(0.0001)

(0.0002)

(0.0001)

HAC-Tr

(0.0001) (0.0004)

(0.0001) (0.0002)

(0.0001)

(0.0002)

(0.0001)

(0.0002)

HAC-Ep

(0.0002) (0.0004)

(0.0001) (0.0003)

(0.0001)

(0.0003)

(0.0001)

(0.0003)

HAC-Bi

(0.0001) (0.0004)

(0.0001) (0.0003)

(0.0001)

(0.0002)

(0.0001)

(0.0002)

ρ

−

−

0.3314

0.3514

0.3484

0.3787

0.3716

Classical

−

−

(0.0058) (0.0059)

(0.0057)

(0.0057)

(0.0058)

(0.0058)

White

−

−

(0.0105) (0.0104)

(0.0112)

(0.0111)

(0.0109)

(0.0108)

HAC-Tr

−

−

(0.0117) (0.0117)

(0.0129)

(0.0128)

(0.0128)

(0.0127)

HAC-Ep

−

−

(0.0121) (0.0121)

(0.0134)

(0.0132)

(0.0132)

(0.0131)

HAC-Bi

−

−

(0.0118) (0.0117)

(0.0130)

(0.0129)

(0.0129)

(0.0128)

R 2 (var ratio) 0.7761

0.7947

0.7814

0.8017

0.7833

0.8038

0.7849

0.8055

AK test

18323.48 60.24

137.46

146.9

242.05

564.72

646.04

p-value

0

0

0

0

0

0

0

0.3266

(0.0002)

References

Andrews DW (1991) Heteroscedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59:817–858
Anselin L (1980) Estimation methods for spatial autoregressive structures. Regional Science Dissertation and Monograph Series, Cornell University, Ithaca
Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht
Anselin L (1998) GIS research infrastructure for spatial analysis of real estate markets. J Housing Res 9(1):113–133
Anselin L (2001a) Rao’s score test in spatial econometrics. J Stat Plan Inf 97:113–139
Anselin L (2001b) Spatial econometrics. In: Baltagi B (ed) A companion to theoretical econometrics. Blackwell, Oxford, pp 310–330
Anselin L (2001c) Spatial effects in econometric practice in environmental and resource economics. Am J Agric Econ 83(3):705–710
Anselin L (2002) Under the hood. Issues in the specification and interpretation of spatial regression models. Agric Econ 27(3):247–267
Anselin L (2006) Spatial econometrics. In: Mills T, Patterson K (eds) Palgrave handbook of econometrics, vol 1, Econometric Theory. Palgrave Macmillan, Basingstoke, pp 901–969
Anselin L, Bera A (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah A, Giles DE (eds) Handbook of applied economic statistics. Marcel Dekker, New York, pp 237–289
Anselin L, Kelejian HH (1997) Testing for spatial error autocorrelation in the presence of endogenous regressors. Int Reg Sci Rev 20:153–182

Errors in variables and spatial effects in hedonic house price models of ambient air quality


Anselin L, Le Gallo J (2006) Interpolation of air quality measures in hedonic house price models: spatial aspects. Spat Econ Anal 1:31–52
Anselin L, Bera A, Florax R, Yoon M (1996) Simple diagnostic tests for spatial dependence. Reg Sci Urban Econ 26:77–104
Basu S, Thibodeau TG (1998) Analysis of spatial autocorrelation in house prices. J Real Estate Finance Econ 17(1):61–85
Bayer P, Keohane N, Timmins C (2006) Migration and hedonic valuation: the case of air quality. NBER Working Papers Series, Paper No. 12106
Beron KJ, Murdoch JC, Thayer MA (1999) Hierarchical linear models with application to air pollution in the South Coast Air Basin. Am J Agric Econ 81:1123–1127
Beron K, Murdoch J, Thayer M (2001) The benefits of visibility improvement: New evidence from Los Angeles metropolitan area. J Real Estate Finance Econ 22(2–3):319–337
Beron KJ, Hanson Y, Murdoch JC, Thayer MA (2004) Hedonic price functions and spatial dependence: implications for the demand for urban air quality. In: Anselin L, Florax RJ, Rey SJ (eds) Advances in spatial econometrics: methodology, tools and applications. Springer-Verlag, Berlin, pp 267–281
Bourassa S, Hamelink F, Hoesli M, MacGregor B (1999) Defining residential submarkets. J Housing Econ 8:160–183
Boyle MA, Kiel KA (2001) A survey of house price hedonic studies of the impact of environmental externalities. J Real Estate Lit 9:117–144
Brasington DM, Hite D (2005) Demand for environmental quality: a spatial hedonic analysis. Reg Sci Urban Econ 35:57–82
Brueckner JK (2003) Strategic interaction among governments: An overview of empirical studies. Int Reg Sci Rev 26(2):175–188
Cameron AC, Trivedi PK (2005) Microeconometrics: methods and applications. Cambridge University Press, Cambridge
Chay KY, Greenstone M (2005) Does air quality matter? Evidence from the housing market. J Polit Econ 113(2):376–424
Conley TG (1999) GMM estimation with cross-sectional dependence. J Econom 92:1–45
Dubin R, Pace RK, Thibodeau TG (1999) Spatial autoregression techniques for real estate data. J Real Estate Lit 7:79–95
Freeman AM III (2003) The measurement of environmental and resource values, theory and methods, 2nd edn. Resources for the Future Press, Washington
Friedman M (1957) A theory of the consumption function. Princeton University Press, Princeton
Gillen K, Thibodeau TG, Wachter S (2001) Anisotropic autocorrelation in house prices. J Real Estate Finance Econ 23(1):5–30
Graves P, Murdoch JC, Thayer MA, Waldman D (1988) The robustness of hedonic price estimation: urban air quality. Land Econ 64:220–233
Greene WH (2003) Econometric analysis, 5th edn. Prentice Hall, Upper Saddle River
Hall P, Patil P (1994) Properties of nonparametric estimators of autocovariance for stationary random fields. Prob Theory Related Fields 99:399–424
Härdle W (1990) Applied nonparametric regression. Cambridge University Press, Cambridge
Harrison D, Rubinfeld DL (1978) Hedonic housing prices and the demand for clean air. J Environ Econ Manage 5:81–102
Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17(1):99–121
Kelejian HH, Prucha IR (2006a) HAC estimation in a spatial framework. J Econom (forthcoming)
Kelejian HH, Prucha IR (2006b) Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Working paper, Department of Economics, University of Maryland, College Park
Kelejian HH, Robinson DP (1993) A suggested method of estimation for spatial interdependent models with autocorrelated errors, and an application to a county expenditure model. Papers Reg Sci 72:297–312
Kelejian HH, Prucha IR, Yuzefovich Y (2004) Instrumental variable estimation of a spatial autoregressive model with autoregressive disturbances: large and small sample results. In: LeSage JP, Pace RK (eds) Advances in econometrics: spatial and spatiotemporal econometrics. Elsevier Science Ltd., Oxford, pp 163–198


Kim CW, Phipps T, Anselin L (2003) Measuring the benefits of air quality improvement: a spatial hedonic approach. J Environ Econ Manage 45:24–39
Lee L-F (2003) Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econom Rev 22:307–335
Lee L-F (2006) GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. J Econom (forthcoming)
Lin X, Lee L-F (2005) GMM estimation of spatial autoregressive models with unknown heteroskedasticity. Working paper, The Ohio State University, Columbus
Manski CF (2000) Economic analysis of social interactions. J Econ Perspect 14(3):115–136
Moulton BR (1990) An illustration of a pitfall in estimating the effects of aggregate variables on micro units. Rev Econ Stat 72:334–338
Newey WK, West KD (1987) A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55:703–708
Pace RK, LeSage JP (2004) Spatial statistics and real estate. J Real Estate Finance Econ 29:147–148
Pace RK, Barry R, Clapp JM, Rodriguez M (1998) Spatial autocorrelation and neighborhood quality. J Real Estate Finance Econ 17(1):15–33
Palmquist RB (1991) Hedonic methods. In: Braden JB, Kolstad CD (eds) Measuring the demand for environmental quality. North Holland, Amsterdam, pp 77–120
Ridker R, Henning J (1967) The determinants of residential property values with special reference to air pollution. Rev Econ Stat 49:246–257
Simonoff JS (1996) Smoothing methods in statistics. Springer, Heidelberg
Small KA, Steimetz S (2006) Spatial hedonics and the willingness to pay for residential amenities. Economics working paper no. 05-06-31, University of California, Irvine
Smith VK, Huang J-C (1993) Hedonic models and air pollution: 25 years and counting. Environ Resource Econ 3:381–394
Smith VK, Huang J-C (1995) Can markets value air quality? A meta-analysis of hedonic property value models. J Polit Econ 103:209–227
Smith VK, Sieg H, Banzhaf HS, Walsh RP (2004) General equilibrium benefits for environmental improvements: projected ozone reductions under EPA’s Prospective Analysis for the Los Angeles air basin. J Environ Econ Manage 47:559–584
Staiger D, Stock JH (1997) Instrumental variables regression with weak instruments. Econometrica 65(3):557–586
White H (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48:817–838

A generalized method of moments estimator for a spatial model with moving average errors, with application to real estate prices

Bernard Fingleton

B. Fingleton, Department of Land Economy, Cambridge University, 19 Silver Street, Cambridge CB3 9EP, UK. e-mail: [email protected]

Abstract This paper proposes a new GMM estimator for spatial regression models with moving average errors. Monte Carlo results are given which suggest that the GMM estimates are consistent and robust to non-normality, and the Bootstrap method is suggested as a way of testing the significance of the moving average parameter. The estimator is applied in a model of English real estate prices, in which the concepts of displaced demand and displaced supply are introduced to derive the spatial lag of prices, and the moving average error process represents spatially autocorrelated unmodelled variables.

Keywords Moving averages · GMM · Real estate · Spatial econometrics

JEL Classification R31 · R12 · C21

1 Introduction

A recently proposed GMM estimator (Kelejian and Prucha 1998) for regression models with a spatial error process (Anselin 1988a,b, 2003) assumes spatially autoregressive (AR) errors, with the implication that shocks are transmitted globally. It is apparent that Kelejian and Prucha-type procedures could be applied to other kinds of models. For instance, Kelejian and Prucha (1999) mention that higher order spatial models, involving more than one spatial lag of the disturbance term as well as the innovation term, have been considered in the literature. Apart from these autoregressive specifications, one important model that could also be considered is a regression model with a spatial moving average (MA) error process (Haining 1978), as described in this paper. The rationale for assuming a spatial moving average error process is that in some cases it may be more realistic to treat the transmission of shocks as a local rather than a global phenomenon.

Building on the approach of Kelejian and Prucha (1998), in this paper a new GMM estimator is suggested for spatial MA errors, and Monte Carlo methods are used to illustrate the properties of the new estimator. The Monte Carlo results suggest consistency and robustness. The paper also applies the estimator to real data on house prices, and uses the Bootstrap as a method of assessing the significance of the MA error process. In the house price reduced form, the spatial lag of prices is derived as the net effect of displaced demand and displaced supply.

2 AR and MA error processes

Consider the regression specification

$$Y = Xb + u \tag{2.1}$$

in which $Y$ is an $n \times 1$ vector of observations of the dependent variable, $X$ is the $n \times k$ matrix of regressors, $b$ is a $k \times 1$ vector of coefficients and $u$ is an $n \times 1$ vector produced by a random error process. The AR process is

$$u = \rho_1 W u + \xi \tag{2.2}$$

in which $W$ is an $n \times n$ matrix of non-stochastic weights, which we take to be row-normalized with row sums equal to 1. Also $\rho_1$ is a scalar parameter¹ with $|\rho_1| < 1$, and $\xi$ is an $n \times 1$ vector of independent, identically distributed innovations with² $E\xi_j = 0$ and $E\xi_j^2 = \sigma^2$, where $0 < \underline{a}_\sigma \le \sigma^2 \le \bar{a}_\sigma < \infty$. We also make the standard assumption that the innovations have finite fourth moments ($E\xi_j^4 < \infty$), thus ensuring a finite domain for estimation. The AR process implies complex interdependence between locations, so that a shock at location $j$ is transmitted to all other locations, as indicated by the expansion of

$$u = (I - \rho_1 W)^{-1}\xi \tag{2.3}$$

which is

$$(I - \rho_1 W)^{-1}\xi = \sum_{i=0}^{\infty} \rho_1^i W^i \xi = \xi + \rho_1 W\xi + \rho_1^2 W^2\xi + \rho_1^3 W^3\xi + \cdots \tag{2.4}$$

in which $W^0 = I$, $W^2$ is the matrix product of $W$ and $W$, and $W^i$ is the matrix product of $W^{i-1}$ and $W$. The effect of a shock at $j$ is therefore felt directly at $j$, and there is a first order effect due to $\rho_1 W\xi$ which affects only those location pairs for which there is a non-zero element in the $W$ matrix. If $W$ were a contiguity matrix we might think of these as local effects. The global effect of a shock occurs because it is transmitted also to locations that are 'neighbours of neighbours' via the powers of $W$. Note that the effect rebounds. A shock to $j$ affects the neighbours, and the neighbours of the neighbours, and eventually works its way back to $j$. In other words the full effect of a shock to $j$ is not simply the shock itself, but the initial shock plus the feedback from the other locations. To summarise, even though some cells of $W$ may be zero, indicating no direct interaction between some locations, the matrix $(I - \rho_1 W)^{-1}$ has no zero cells, so each cell of $u$ depends on $\xi_j$.

¹ The subscript 1 has been used to distinguish this from the equivalent parameter $\rho_2$ for the MA error process.

² $E$ is the expectation.

In contrast the MA error process is

$$u = (I - \rho_2 W)\xi \tag{2.5}$$

so that a shock at location $j$ will only affect the directly interacting locations as given by the non-zero elements in $W$. Hence shock-effects are local rather than global. Consider next the regression specification

$$Y = \lambda W_E Y + Xb + u \tag{2.6}$$

in which $W_E$ is another $n \times n$ matrix of non-stochastic weights, row-normalised with row sums equal to 1, and the spatial lag $W_E Y$ is an endogenous $n \times 1$ vector. With the AR error process, with $u$ given by (2.2), the reduced form is

$$Y = (I - \lambda W_E)^{-1} Xb + (I - \lambda W_E)^{-1}(I - \rho_1 W)^{-1}\xi \tag{2.7}$$

which is the model discussed by Kelejian and Prucha (1998). We refer to this as a SARAR model (see Anselin and Florax 1995). With the MA error process, with $u$ given by (2.5), the reduced form is

$$Y = (I - \lambda W_E)^{-1} Xb + (I - \lambda W_E)^{-1}(I - \rho_2 W)\xi \tag{2.8}$$

which is the SARMA model of Anselin and Bera (1998). For the SARAR model, a shock at location $j$ affects all other locations, with the global spillover due to the AR error process amplified by the extra spatial multiplier effect due to the SAR element. As noted by Anselin (2003), the 'induced pattern of spatial dependence for the error term is much more complex and involves the interaction between the two spatial parameters as well as the two spatial weights'. This is exemplified³ by comparing Figs. 1 and 2. These show contour lines for the impact of a shock at location $j$, which is the central cell of a 15 × 15 lattice. Figure 1 shows the outcome for the AR model, and Fig. 2 is what happens under the SARAR model. Figure 3 shows the contour lines for the MA model. Here spillover terminates abruptly at contiguous cells. Figure 4 shows the impacts under the SARMA model. The sharp cut-off of the MA specification is moderated by the SAR element of the model. Each impact is the difference in $Y$ as a result of using $\xi$ and $\xi'$.

³ These Figures were obtained using the same assumptions as used to generate Table 1, except that $\rho_2 = -0.9$, $\rho_1 = 0.9$, $\lambda = 0.5$, $\xi \sim N(0, 1)$, $\xi'_k = \xi_k$ for $k = 1, \ldots, n$, $k \ne j$; $\xi'_j = \xi_j + 3$.

Fig. 1 Shock effects with AR errors

Fig. 2 Shock effects under SARAR

We generalize the SARMA specification given above to include endogenous regressors additional to the spatial lag $W_E Y$, denoting these by the $n \times c$ matrix $R$. Hence

$$\begin{aligned} Y &= \lambda W_E Y + H\gamma + R\eta + u \\ X &= (W_E Y, H, R) \\ b' &= (\lambda, \gamma', \eta') \\ Y &= Xb + u \end{aligned} \tag{2.9}$$


Fig. 3 Shock effects with MA errors

Fig. 4 Shock effects under SARMA

in which $H$ is an $n \times k$ matrix of (exogenous) regressors and $X$ is the $n \times (1 + k + c)$ matrix of right hand side variables. Also $\gamma$ is a $k \times 1$ vector and $\eta$ is a $c \times 1$ vector of parameters. The $n \times 1$ vector of moving average disturbances is

$$u = (I - \rho_2 W)\xi = \xi - \rho_2 W\xi = \xi - \rho_2 \bar{\xi} \tag{2.10}$$
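The contrast between the global reach of an AR shock in (2.3) and the local reach of an MA shock in (2.10) can be checked numerically. The following is a minimal sketch, not the author's code: the helper `rook_weights`, the shock size and the parameter values are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix for a side x side lattice (hypothetical helper)."""
    path = np.eye(side, k=1) + np.eye(side, k=-1)          # neighbours along one axis
    adj = np.kron(np.eye(side), path) + np.kron(path, np.eye(side))
    return adj / adj.sum(axis=1, keepdims=True)

side = 15
n = side * side
W = rook_weights(side)
j = (side // 2) * side + side // 2        # index of the central cell

shock = np.zeros(n)
shock[j] = 3.0                            # illustrative shock at location j only

rho1, rho2 = 0.9, -0.9                    # illustrative parameter values
ar_effect = np.linalg.solve(np.eye(n) - rho1 * W, shock)   # (I - rho1 W)^{-1} applied to the shock
ma_effect = (np.eye(n) - rho2 * W) @ shock                 # (I - rho2 W) applied to the shock

ar_cells = int(np.sum(np.abs(ar_effect) > 1e-12))
ma_cells = int(np.sum(np.abs(ma_effect) > 1e-12))
print(f"cells affected: AR = {ar_cells}, MA = {ma_cells}")
```

Under these assumptions the MA effect is confined to the shocked cell and its four rook neighbours, whereas under AR errors every cell of the lattice receives a non-zero effect.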

3 Estimating the SARMA model via GMM

Kelejian and Prucha (1998) give formal results leading to a feasible generalized spatial two stage least squares (GS2SLS) estimator for the parameters of the SARAR model.


The present set-up is different since I assume that the error process is a spatial moving average (SARMA) rather than a spatial autoregressive process. I also suggest in equation (2.9) that it would be useful from the applied perspective to allow endogenous variables additional to the endogenous lag, as in Kelejian and Prucha (2004), although in practice the application and simulations in this paper are restricted to the case of exogenous regressors. The low level assumptions made by Kelejian and Prucha (1998) are the basis of theorems they prove to show the consistency of their estimators and appropriate distributional approximations allowing small sample inference. Although the set-up is different here, one might anticipate that similar formal results would ensue for SARMA, since some of the assumptions would clearly be the same. While there is a need, in a formal analysis, to fully identify assumptions appropriate to the SARMA context and to explore their implications, the aim of the present paper is not to follow this formal route, but to show by Monte Carlo methods and by application that the proposed estimator is apparently consistent and does provide a practical method of analysis.

To summarize, in this paper I give a similar estimator to the GS2SLS proposed by Kelejian and Prucha (1998) but for the SARMA model, and obtain Monte Carlo results which suggest consistency. The method comprises three stages. In the first stage the model is estimated by 2SLS. The second stage uses the resulting 2SLS residuals to estimate $\rho_2$ and $\sigma^2$ using a GM procedure. This is based on some new moment relationships I have obtained using the SARMA specification. In the third stage, the estimated $\rho_2$ is used to perform a Cochrane–Orcutt type transformation to account for the spatial dependence in the residuals. Note that this involves multiplying through by $(I - \hat{\rho}_2 W)^{-1}$ rather than by $(I - \hat{\rho}_1 W)$ as required for the SARAR model. Given a very large $W$ matrix, accurate determination of the inverse may be computationally challenging and $(I - \hat{\rho}_2 W)$ may turn out to be singular, so Moore-Penrose generalized inverses are used to avoid singularities. Note that Kelejian and Prucha (1998, 1999) assume nonsingularity for the matrix $(I - \rho_1 W)$ for all $|\rho_1| < 1$.

In the first stage, as instruments we employ a linearly independent subset of the exogenous variables and their low order lags,⁴ since including high order lags will tend to induce linear dependence. This gives $Z$, which is an $n \times f$ matrix of instruments comprising the exogenous variables, $H$, and their first and second spatial lags ($W_E H$ and $W_E^2 H$). Assume matrices $X$ and $Z$ are full column rank with $f \ge (c + k + 1)$, and $P_z = Z(Z'Z)^{-1}Z'$ is the symmetric and idempotent projection matrix, so that $\hat{X} = P_z X$ and

$$\hat{b} = (\hat{X}'\hat{X})^{-1}\hat{X}'Y, \qquad \hat{Y} = X\hat{b}, \qquad \hat{u} = Y - \hat{Y} \tag{3.1}$$

⁴ In practice in the simulations reported, first and second spatial lags are used, in other words the matrix products of the normalised $W$ matrix and the exogenous variables, denoted by $W_E H$, and the matrix products of $W_E$ and $W_E H$.
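The first stage in (3.1) can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's implementation: the function name `two_stage_least_squares`, the 10 × 10 rook lattice, the simulated data and the parameter values are all hypothetical, and the projection $P_z X$ is computed by least squares instead of forming the $n \times n$ matrix $P_z$ explicitly.

```python
import numpy as np

def two_stage_least_squares(Y, X, Z):
    """2SLS for Y = X b + u with instrument matrix Z; returns (b_hat, u_hat)."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]    # P_z X without forming P_z
    b_hat = np.linalg.lstsq(X_hat, Y, rcond=None)[0]    # (X_hat' X_hat)^{-1} X_hat' Y
    u_hat = Y - X @ b_hat                               # residuals use X, not X_hat
    return b_hat, u_hat

# Illustrative data (assumed values): spatial lag model on a 10 x 10 rook lattice.
rng = np.random.default_rng(0)
side = 10
n = side * side
path = np.eye(side, k=1) + np.eye(side, k=-1)
adj = np.kron(np.eye(side), path) + np.kron(path, np.eye(side))
W = adj / adj.sum(axis=1, keepdims=True)                # row-normalised weights

H = rng.uniform(0.0, 100.0, size=(n, 2))                # exogenous regressors
ones = np.ones((n, 1))
lam, gamma = 0.75, np.array([1.0, 10.0, 10.0])
xi = rng.standard_normal(n)
Y = np.linalg.solve(np.eye(n) - lam * W, np.column_stack([ones, H]) @ gamma + xi)

X = np.column_stack([W @ Y, ones, H])                   # spatial lag, constant, H
Z = np.column_stack([ones, H, W @ H, W @ W @ H])        # H and its first two spatial lags
b_hat, u_hat = two_stage_least_squares(Y, X, Z)
print(b_hat)
```

The constant is not lagged here because, with row-normalised weights, its spatial lag is again a constant and would make $Z$ rank deficient.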


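The estimated $b$ equates to regressing $Y$ on the fitted values of the endogenous variables and on the exogenous regressors. Note that if $Z = X$ then $\hat{X} = X$ and this estimator reduces to OLS. In order to develop the equation system, consider next the lagged residuals

$$\bar{u} = Wu = W(\xi - \rho_2\bar{\xi}) = \bar{\xi} - \rho_2\bar{\bar{\xi}} \tag{3.2}$$

Squaring and summing the residuals and the lagged residuals, and summing the vector of their products, and dividing by $n$, gives

$$n^{-1}\sum(\xi - \rho_2\bar{\xi})(\xi - \rho_2\bar{\xi}) = n^{-1}\sum\xi^2 + \rho_2^2\, n^{-1}\sum\bar{\xi}^2 - 2\rho_2\, n^{-1}\sum\xi\bar{\xi} = n^{-1}\sum u^2 \tag{3.3}$$

$$n^{-1}\sum(\bar{\xi} - \rho_2\bar{\bar{\xi}})(\bar{\xi} - \rho_2\bar{\bar{\xi}}) = n^{-1}\sum\bar{\xi}^2 + \rho_2^2\, n^{-1}\sum\bar{\bar{\xi}}^2 - 2\rho_2\, n^{-1}\sum\bar{\xi}\bar{\bar{\xi}} = n^{-1}\sum\bar{u}^2 \tag{3.4}$$

$$n^{-1}\sum(\xi - \rho_2\bar{\xi})(\bar{\xi} - \rho_2\bar{\bar{\xi}}) = n^{-1}\sum\xi\bar{\xi} + \rho_2^2\, n^{-1}\sum\bar{\xi}\bar{\bar{\xi}} - \rho_2\, n^{-1}\sum\bar{\xi}^2 - \rho_2\, n^{-1}\sum\xi\bar{\bar{\xi}} = n^{-1}\sum u\bar{u} \tag{3.5}$$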

Expectations are taken across the terms in (3.3), (3.4) and (3.5) to give equations (3.11), (3.12) and (3.13), making use of the results in (3.6) to (3.10). It is possible to show the following (see Kelejian and Prucha 1999)

$$\begin{aligned} E\Big(n^{-1}\sum\xi^2\Big) &= \sigma^2 \\ E\Big(n^{-1}\sum\bar{\xi}^2\Big) &= n^{-1}E[\mathrm{Tr}(\xi'W'W\xi)] = n^{-1}\mathrm{Tr}(E(\xi\xi')W'W) = n^{-1}\sigma^2\,\mathrm{Tr}(W'W) \end{aligned} \tag{3.6}$$

in which $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix and $E$ is the expectation. In addition, I obtain the new expressions

$$E\Big(n^{-1}\sum\bar{\bar{\xi}}^2\Big) = n^{-1}E[\mathrm{Tr}(\xi'W'W'WW\xi)] = n^{-1}\mathrm{Tr}(E(\xi\xi')W'WW'W) = n^{-1}\sigma^2\,\mathrm{Tr}(W'WW'W) \tag{3.7}$$

$$E\Big(n^{-1}\sum\xi\bar{\bar{\xi}}\Big) = n^{-1}\mathrm{Tr}(E(\xi'WW\xi)) = n^{-1}\sigma^2\,\mathrm{Tr}(WW) \tag{3.8}$$

I also use the fact that

$$E\Big(n^{-1}\sum\xi\bar{\xi}\Big) = 0 \tag{3.9}$$

and

$$E\Big(n^{-1}\sum\bar{\xi}\bar{\bar{\xi}}\Big) = n^{-1}\sigma^2\,\mathrm{Tr}(W'WW) \tag{3.10}$$

These lead to

$$\sigma^2 + n^{-1}\rho_2^2\sigma^2\,\mathrm{Tr}(W'W) - 0 = n^{-1}E\Big(\sum u^2\Big) \tag{3.11}$$

$$n^{-1}\sigma^2\,\mathrm{Tr}(W'W) + n^{-1}\rho_2^2\sigma^2\,\mathrm{Tr}(W'WW'W) - 2\rho_2 n^{-1}\sigma^2\,\mathrm{Tr}(W'WW) = n^{-1}E\Big(\sum\bar{u}^2\Big) \tag{3.12}$$

$$\rho_2^2 n^{-1}\sigma^2\,\mathrm{Tr}(W'WW) - n^{-1}\rho_2\sigma^2\,\mathrm{Tr}(W'W) - n^{-1}\rho_2\sigma^2\,\mathrm{Tr}(WW) = n^{-1}E\Big(\sum u\bar{u}\Big) \tag{3.13}$$

Given that the 2SLS residuals are $\hat{u}$ and their spatial lags are $\hat{\bar{u}} = W\hat{u}$, the expectations on the right hand side of equations (3.11), (3.12) and (3.13) are replaced by their empirical estimates $\hat{u}'\hat{u}$, $\hat{\bar{u}}'\hat{\bar{u}}$ and $\hat{u}'\hat{\bar{u}}$ in order to estimate $\rho_2$ and $\sigma^2$. To simplify the notation let us define the following

$$t_1 = \mathrm{Tr}(W'W) \tag{3.14a}$$

$$t_2 = \mathrm{Tr}(W'WW'W) \tag{3.14b}$$

$$t_3 = \mathrm{Tr}(W'WW) \tag{3.14c}$$

$$t_4 = \mathrm{Tr}(WW) \tag{3.14d}$$

Let

$$G = \begin{bmatrix} 1 & \dfrac{t_1}{n} & 0 & 0 \\[2mm] \dfrac{t_1}{n} & \dfrac{t_2}{n} & \dfrac{2t_3}{n} & 0 \\[2mm] 0 & \dfrac{t_3}{n} & \dfrac{t_1}{n} & \dfrac{t_4}{n} \end{bmatrix} \tag{3.15}$$

and

$$g = \begin{bmatrix} \dfrac{\hat{u}'\hat{u}}{n} \\[2mm] \dfrac{\hat{\bar{u}}'\hat{\bar{u}}}{n} \\[2mm] \dfrac{\hat{u}'\hat{\bar{u}}}{n} \end{bmatrix} \tag{3.16}$$

Using equations (3.11), (3.12) and (3.13) we obtain

$$G\begin{bmatrix} \sigma^2 & \rho_2^2\sigma^2 & -\rho_2\sigma^2 & -\rho_2\sigma^2 \end{bmatrix}' - g = \zeta(\rho_2, \sigma^2)$$

in which $\zeta(\rho_2, \sigma^2)$ is a vector of residuals, and the nonlinear least squares estimators⁵ are given by

$$(\hat{\rho}_2, \hat{\sigma}^2) = \arg\min\{\zeta(\rho_2, \sigma^2)'\zeta(\rho_2, \sigma^2)\} \tag{3.17}$$
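The second stage can be sketched as follows, using the traces in (3.14a) to (3.14d) exactly as defined above. This is a hedged illustration, not the author's code: instead of the modified Newton-Raphson routine described in the text, it uses a simple grid over $\rho_2$ with $\sigma^2$ concentrated out (for fixed $\rho_2$ the residual vector is linear in $\sigma^2$, so its least squares minimiser is available in closed form). The function names, the grid and the simulated residuals are assumptions.

```python
import numpy as np

def moment_system(W, u_hat):
    """Build G as in (3.15) and g as in (3.16) from weights W and residuals u_hat."""
    n = W.shape[0]
    WtW = W.T @ W
    t1 = np.trace(WtW)             # (3.14a)
    t2 = np.trace(WtW @ WtW)       # (3.14b), as defined in the text
    t3 = np.trace(WtW @ W)         # (3.14c)
    t4 = np.trace(W @ W)           # (3.14d)
    G = np.array([[1.0,    t1 / n, 0.0,        0.0],
                  [t1 / n, t2 / n, 2 * t3 / n, 0.0],
                  [0.0,    t3 / n, t1 / n,     t4 / n]])
    u_bar = W @ u_hat
    g = np.array([u_hat @ u_hat, u_bar @ u_bar, u_hat @ u_bar]) / n
    return G, g

def fit_rho_sigma2(G, g, grid=None):
    """Minimise zeta'zeta over a grid of rho2 values, with sigma2 concentrated out."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 1981)            # assumed search range, step 0.001
    best = None
    for rho in grid:
        a = G @ np.array([1.0, rho**2, -rho, -rho])      # zeta = sigma2 * a - g
        sigma2 = (a @ g) / (a @ a)                       # closed-form minimiser for this rho
        zeta = sigma2 * a - g
        ssq = zeta @ zeta
        if best is None or ssq < best[0]:
            best = (ssq, rho, sigma2)
    return best[1], best[2]

# Illustrative use: simulated MA errors stand in for the 2SLS residuals.
rng = np.random.default_rng(1)
side = 15
n = side * side
path = np.eye(side, k=1) + np.eye(side, k=-1)
adj = np.kron(np.eye(side), path) + np.kron(path, np.eye(side))
W = adj / adj.sum(axis=1, keepdims=True)
xi = rng.standard_normal(n)
u = xi - (-0.5) * (W @ xi)                               # rho2 = -0.5, sigma2 = 1
G, g = moment_system(W, u)
rho2_hat, sigma2_hat = fit_rho_sigma2(G, g)
print(rho2_hat, sigma2_hat)
```

The grid search does not constrain $\hat{\sigma}^2$ to be positive; a derivative-free or Newton-type optimiser could be substituted for the grid without changing `moment_system`.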

In the third stage I use the current estimate $\hat{\rho}_2$ to carry out Cochrane-Orcutt type transformations. In contrast to estimating SARAR as in Kelejian and Prucha (1999), this requires a matrix inversion; Moore-Penrose generalized inverses are used to avoid singularities.

$$\begin{aligned} Y^* &= (I - \hat{\rho}_2 W)^{-1} Y \\ X^* &= (I - \hat{\rho}_2 W)^{-1} X \\ \hat{X}^* &= P_z X^* \\ \hat{b}^* &= (\hat{X}^{*\prime}\hat{X}^*)^{-1}\hat{X}^{*\prime}Y^* \end{aligned} \tag{3.18}$$
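A minimal sketch of the transformation in (3.18) is given below, assuming NumPy's `numpy.linalg.pinv` as the Moore-Penrose generalized inverse. The function name `stage_three`, the small lattice and the noise-free data are hypothetical and chosen only so that the result is easy to check: when $Y = Xb$ exactly, the transformed regression returns $b$ whatever value of $\hat{\rho}_2$ is supplied.

```python
import numpy as np

def stage_three(Y, X, Z, W, rho2_hat):
    """Transform by the generalized inverse of (I - rho2_hat W), then apply IV as in (3.18)."""
    n = W.shape[0]
    A_pinv = np.linalg.pinv(np.eye(n) - rho2_hat * W)      # Moore-Penrose generalized inverse
    Y_star = A_pinv @ Y
    X_star = A_pinv @ X
    X_star_hat = Z @ np.linalg.lstsq(Z, X_star, rcond=None)[0]   # P_z X*
    b_star = np.linalg.lstsq(X_star_hat, Y_star, rcond=None)[0]  # (X*^' X*^)^{-1} X*^' Y*
    return b_star

# Noise-free illustration with assumed values.
rng = np.random.default_rng(2)
side = 6
n = side * side
path = np.eye(side, k=1) + np.eye(side, k=-1)
adj = np.kron(np.eye(side), path) + np.kron(path, np.eye(side))
W = adj / adj.sum(axis=1, keepdims=True)

H = rng.uniform(0.0, 100.0, size=(n, 2))
X = np.column_stack([np.ones(n), H])       # exogenous regressors only
Z = np.column_stack([X, W @ H])            # X plus first spatial lags of H
b_true = np.array([1.0, 10.0, 10.0])
Y = X @ b_true                             # no disturbance, for checking only

b_star = stage_three(Y, X, Z, W, rho2_hat=-0.4)
print(b_star)
```

For large $n$ the dense generalized inverse is the expensive step; the sketch makes no attempt to exploit the sparsity of $W$.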

From this I obtain a new set of residuals

$$\tilde{u} = Y - X\hat{b}^* \tag{3.19}$$

and thus new lagged residuals $\tilde{\bar{u}}$, which are used in place of the initial values in $g$ to obtain revised estimates $\hat{\rho}_2$ and $\hat{\sigma}^2$, again using the GM procedure of stage 2. Stage 3 gives revised residuals $\tilde{\tilde{u}}$ so there is the option to again repeat the process.⁶ Note that with $Z = X$ this procedure reduces to estimation of a MA model with exogenous regressors. On termination of the iterations, the estimated variance-covariance matrix of the parameters is given by

$$\hat{C} = \hat{\sigma}^2(\hat{X}^{*\prime}\hat{X}^*)^{-1} \tag{3.20}$$

⁵ I use unconstrained non-linear least squares estimation via a modified Newton-Raphson method which is suitable for minimising any non-linear function. This depends on numerical differences, so there is no need to specify derivatives.

⁶ After a few iterations we find that the difference between subsequent minimised values of $\zeta(\rho_2, \sigma^2)'\zeta(\rho_2, \sigma^2)$ becomes sufficiently small (using a cut-off criterion of less than 0.000000001) to terminate. The parameter estimates are taken as those for which $\zeta(\rho_2, \sigma^2)'\zeta(\rho_2, \sigma^2)$ is minimized overall.

The standard errors of the coefficients $\hat{b}^*$ are given by the square roots of the values on the main diagonal of $\hat{C}$, thus enabling 't-ratios' to be calculated, with the exception of


Table 1 Parameter distributions under SARMA Mean λˆ k

St. dev.

0.7500

0.0002

γˆ0k

1.0017

0.6756

γˆ1k

9.9999

0.0024

γˆ2k

9.9999

0.0024

ρˆ2k

−0.4736

0.1164

σˆ k2

0.9802

0.0469

Skewness

Kurtosis

−0.14

−0.13

0.11

13.07

−0.36

−0.15

9.16

−0.31

−0.19

8.30

0.04

−0.04 0.03

Normality 5.12

−0.28

8.62

−0.39

15.09

Skewness is calculated as $\sum(x_i - m)^3 / ((n - 1)s^3)$; kurtosis is calculated as $\{\sum(x_i - m)^4 / ((n - 1)s^4)\} - 3$, where, for variable $x$, $m = \sum x_i / n$. The goodness of fit to the normal distribution is indicated by the residual deviance, which has an asymptotic chi-squared distribution with the specified degrees of freedom. The table is formed by dividing the data into $\sqrt{n}$ groups of approximately equal observed frequency. The degrees of freedom are $n - p - 1$, where $n$ is the number of cells in the table of fitted values and $p$ is the number of parameters (2) estimated in the model. Here there are 7 df.

Table 2 Parameter distributions under SARMA with non-normal errors

l. Normal Mean λˆ k γˆ0k

γˆ1k

Mix of normals St. dev.

Skew.

0.7500

0.0002

0.9460 10.0001

γˆ2k

9.9999

σˆ k2

0.9420

ρˆ2k −0.1822

Kurt.

Normal

Mean

St. dev.

Skew.

0.49

1.05

4.93

0.7499

0.23

5.22

0.7596 −0.34

−0.07

3.83

1.2351

0.7985

0.50

0.14

2.26

0.22

−0.47

7.35

0.78

6.78

−0.25

3.92

0.0002 −0.40

0.0023 −0.34

−0.25

3.86

9.9998

0.0022

19.19

18.32

9.9999

0.0775 −0.36

0.41

9.04

29.87

48.83

−0.1802

0.0025 −0.17

0.0026 −3.30 0.2894

4.52

0.9733

0.0804 −0.39 0.1983

0.27

Kurt.

−0.62

Normal

8.53

Normality tested using 7 degrees of freedom

the t-ratio for $\hat{\rho}_2$. Since equation (3.20) does not provide a standard error for $\hat{\rho}_2$, this is referred to its Bootstrap distribution to assess its significance.

4 Monte Carlo results

The following illustrates typical Monte Carlo outcomes obtained via specific SARMA specifications. Qualitatively similar results are obtained using different assumptions but to save space these are given in the Appendix.⁷ The results in Tables 1 and 2 are typical⁸ of the outcomes. For simplicity I restrict the endogenous variable to the spatial lag and assume that $W_E = W$, hence

$$Y = \lambda W Y + \gamma_0 + \gamma_1 H_1 + \gamma_2 H_2 + u \tag{4.1}$$

and $u = (I - \rho_2 W)\xi = \xi - \rho_2\bar{\xi}$, where $\xi \sim N(0, \omega^2)$. Exogenous variables $H_1$ and $H_2$ are draws from a uniform distribution with lower and upper bounds equal to 0 and 100. The $n \times n$ matrix $W$ is a Rook's case contiguity matrix on a 15 × 15 square, hence $n = 225$. Matrix $W$ is standardized by dividing each row cell by its row total, so that the maximum and minimum eigenvalues are 1 and −1. The assumed parameter values are $\omega^2 = 1$, $\rho_2 = -0.5$, $\lambda = 0.75$, $\gamma_0 = 1$, $\gamma_1 = 10$, $\gamma_2 = 10$. For iteration $k = 1, \ldots, 100$ (1,000 for the Appendix results), sampling $n$ times from $\xi$ for each $k$, I obtain

$$Y_k = (I - \lambda W)^{-1}\big(\gamma_0 + \gamma_1 H_{1k} + \gamma_2 H_{2k} + (I - \rho_2 W)\xi_k\big) \tag{4.2}$$

⁷ They are also available from the author.

⁸ These are based on 100 samples, but much larger samples give very similar results, as shown in the Appendix.
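The data generating process in (4.2) can be sketched as below, using the parameter values stated in the text. This is an illustrative sketch only: the helper names `rook_weights` and `draw_sample`, the random seed and the choice to redraw $H_1$ and $H_2$ in every replication are assumptions.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix for a side x side lattice (hypothetical helper)."""
    path = np.eye(side, k=1) + np.eye(side, k=-1)
    adj = np.kron(np.eye(side), path) + np.kron(path, np.eye(side))
    return adj / adj.sum(axis=1, keepdims=True)

side = 15
n = side * side                      # n = 225
W = rook_weights(side)
I_n = np.eye(n)

lam, rho2 = 0.75, -0.5               # parameter values from the text
gamma0, gamma1, gamma2 = 1.0, 10.0, 10.0
omega2 = 1.0

def draw_sample(rng):
    """One Monte Carlo replication of (4.2); returns Y, H1, H2 and the MA disturbance u."""
    H1 = rng.uniform(0.0, 100.0, n)
    H2 = rng.uniform(0.0, 100.0, n)
    xi = rng.normal(0.0, np.sqrt(omega2), n)
    u = xi - rho2 * (W @ xi)                                   # (I - rho2 W) xi
    Y = np.linalg.solve(I_n - lam * W, gamma0 + gamma1 * H1 + gamma2 * H2 + u)
    return Y, H1, H2, u

rng = np.random.default_rng(42)      # arbitrary seed
samples = [draw_sample(rng) for _ in range(100)]
```

Each element of `samples` can then be passed through the three estimation stages to build up the Monte Carlo distribution of the estimates.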

Applying the three stage estimation method gives estimates $\hat{\lambda}_k$, $\hat{\gamma}_{0k}$, $\hat{\gamma}_{1k}$, $\hat{\gamma}_{2k}$, $\hat{\rho}_{2k}$ and $\hat{\sigma}_k^2$. Table 1 summarises the parameter estimate distributions. It is evident that the parameter estimate means are close to the true values, suggesting that the estimators are approximately unbiased, although there is more variance in the $\hat{\rho}_2$ distribution than for the other substantive parameters, and a suggestion of bias towards zero equal to about 0.25 of one standard deviation. Apart from $\hat{\sigma}_k^2$, none of the distributions differs significantly from normal, using the upper 5% point (14.07) of the $\chi^2_7$ distribution.

The Appendix results are based on 1,000 Monte Carlo replications, using both Rook's (edge touching) and Queen's (edge and corner touching) definitions of contiguity on a $\sqrt{n} \times \sqrt{n}$ lattice, and also a torus, with opposite sides of the lattice considered to be contiguous so as to eliminate edges. In addition, an irregular contiguity matrix is used based on the map of English unitary authority and local authority districts. Simulations provide a measure of bias and a single indicator combining both precision (variance) and accuracy (bias), as given by a variant of the RMSE statistic (see Appendix). Most attention is focused on $\rho_2$ through the range 0, −0.25, −0.5, −0.75, −0.95, since this equates to positive dependence, with $\lambda = 0.5$ and 0, but some results are also given for negative dependence, which on the whole mirror those for positive dependence. Also I set $n$ = 25, 49, 81, 121, 169, 225 and 400. The results show clear evidence of small sample bias in $\hat{\rho}_2$, which increases in $\rho_2$, with positive bias in the case of positive dependence and negative bias in the case of negative dependence, which means in both cases that the estimated parameter is closer to zero than the true value. However, it is also very apparent that as the sample size ($n$) increases, the bias in $\hat{\rho}_2$ diminishes and the RMSE also falls, suggesting consistency.

From the theoretical perspective, this is not unexpected, since the formal proofs of consistency given by Kelejian and Prucha (1998, 1999) for spatially autoregressive disturbances might also imply that spatial moving average disturbances will yield consistent estimates, given that the disturbances are consistently estimated as the first step, although a formal proof has yet to be given in the case of SARMA. In this paper I obtain parameter estimates for a model with an endogenous spatial lag using instrumental variables, leading to what appear to be consistent generalized moments estimators of $\rho_2$ and $\sigma^2$, but under a spatial moving average process. Apart from these large sample properties, it is apparent that $\hat{\rho}_2$ is likely to be effectively unbiased for many practical applications. For example, there are in excess of 3,000 counties in the USA, and the results given in the Appendix indicate that the difference between the median of the Monte Carlo distribution and the true parameter value ($\rho_2 = -0.5$) is of the order of 0.04 with $n \approx 400$.


A valuable attribute of GMM is that there is no need to make distributional assumptions. To illustrate this, I introduce non-normal errors, meaning that I no longer sample from $\xi \sim N(0, \omega^2)$. To explore the consequences, I use the distributions described by Kelejian and Prucha (1999). There are two distributions to consider. One is a normalized version of the lognormal, which is asymmetric. The other is a normalized mixture of normals, which has thick tails. The normalized lognormal is

$$\xi \sim \frac{\exp(\psi) - \exp(0.5)}{(\exp(2) - \exp(1))^{0.5}}, \qquad \psi \sim N(0, 1) \tag{4.3}$$
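The normalisation in (4.3) subtracts the mean of a standard lognormal, $\exp(0.5)$, and divides by its standard deviation, $(\exp(2) - \exp(1))^{0.5}$, so the innovations have zero mean and unit variance while remaining asymmetric. A small sketch of a sampler follows; the function name `normalised_lognormal`, the seed and the sample size are assumptions made for illustration.

```python
import numpy as np

def normalised_lognormal(rng, size):
    """Draw innovations as in (4.3): a standard lognormal, centred and scaled."""
    psi = rng.standard_normal(size)                        # psi ~ N(0, 1)
    return (np.exp(psi) - np.exp(0.5)) / np.sqrt(np.exp(2.0) - np.exp(1.0))

rng = np.random.default_rng(7)                             # arbitrary seed
xi = normalised_lognormal(rng, 200_000)
skew = np.mean((xi - xi.mean()) ** 3) / xi.std() ** 3      # sample skewness
print(xi.mean(), xi.var(), skew)
```

Because $\exp(\psi) > 0$, the draws are bounded below by $-\exp(0.5)/(\exp(2) - \exp(1))^{0.5}$ but unbounded above, which is the source of the asymmetry.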

Using the same parameter values as for Table 1, except with $\rho_2 = -0.2$, gives the results in Table 2. The point estimates $\hat{\rho}_{2k}$ appear to be downwardly biased towards zero, again by about 0.25 of one standard deviation, and there is some evidence of non-normality, particularly relating to $\hat{\sigma}^2_k$, but overall the estimates do not deviate significantly from their true values and on the whole a normal approximation holds. The mixture of normals is given by

$$\xi \sim \frac{\beta \zeta_1 + (1 - \beta) \zeta_2}{5.95^{0.5}}, \qquad \Pr(\beta = 1) = 0.95, \qquad \zeta_1 \sim N(0, 1), \qquad \zeta_2 \sim N(0, 100) \tag{4.4}$$

so that the Bernoulli distribution for $\beta$ weights the mixture in favour of the $N(0, 1)$ distribution but takes 5% of random draws from the $N(0, 100)$ distribution. This gives the results on the right hand side of Table 2. These also suggest that the point estimates are robust to non-normality, although the mean of the $\hat{\rho}_{2k}$ distribution is again about 10% below the true value, and a normal approximation seems appropriate for the distributions.

One limitation of the proposed estimator is the absence of a standard error for $\hat{\rho}_2$. To compensate for this, I obtain its Bootstrap distribution, which is assumed to be close to the null distribution of $\hat{\rho}_2$ when $\rho_2 = 0$ is true. In order to indicate its significance, $\hat{\rho}_2$ should be an extreme observation with respect to this distribution, which as demonstrated above may not be normal, and the error distribution may be unknown. Here the Bootstrap distribution for $\hat{\rho}_2$ is provided by sampling at random with replacement from the residuals $\hat{u}$, under the Table 1 assumptions, with the sample size equal to $n$ and the probability of drawing $\hat{u}_i$ equal to $1/n$. We assume that $\rho_2 = -0.2$, but various different assumptions closely approximate the null distribution because of randomization.⁹ The exogenous variables $H_1$ and $H_2$ are each drawn from a uniform distribution with upper and lower bounds equal to 0 and 100, and the assumed error distribution is $\xi \sim N(0, \omega^2)$, but $H_1$, $H_2$ and $\xi$ remain fixed at the first draws so that the only cause

⁹ Although the true value of $\rho_2$ is −0.2, randomization breaks up the spatial autocorrelation in the residuals and gives moments similar to those obtained for $\rho_2 = 0$.
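The two non-normal error distributions of Eqs. (4.3) and (4.4) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names, the variable names `z1` and `z2`, the seeds and the sample size are assumptions made here and are not taken from the paper. Both draws are scaled so that their mean is zero and their variance is one.

```python
import math
import random


def normalized_lognormal(rng):
    """One draw of (exp(psi) - exp(0.5)) / sqrt(exp(2) - exp(1)), psi ~ N(0, 1)."""
    psi = rng.gauss(0.0, 1.0)
    return (math.exp(psi) - math.exp(0.5)) / math.sqrt(math.exp(2.0) - math.exp(1.0))


def normalized_mixture(rng):
    """One draw from the 95%/5% mixture of N(0, 1) and N(0, 100), divided by sqrt(5.95)."""
    beta = 1 if rng.random() < 0.95 else 0   # Bernoulli weight, Pr(beta = 1) = 0.95
    z1 = rng.gauss(0.0, 1.0)                 # N(0, 1): standard deviation 1
    z2 = rng.gauss(0.0, 10.0)                # N(0, 100): standard deviation 10
    return (beta * z1 + (1 - beta) * z2) / math.sqrt(5.95)


def sample_moments(draw, n, seed):
    """Sample mean and variance of n draws (helper name is illustrative)."""
    rng = random.Random(seed)
    xs = [draw(rng) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var
```

The scaling constants follow from the moments of the underlying distributions: a lognormal with $\psi \sim N(0, 1)$ has mean $\exp(0.5)$ and variance $\exp(2) - \exp(1)$, and the mixture has variance $0.95 \times 1 + 0.05 \times 100 = 5.95$.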


in which housing demand ($q_j$) is positively related to income from local jobs ($w^l E^l$), equal to the local wage rate ($w^l$) times the local employment level ($E^l$), to wage and employment levels within commuting distance ($w^c E^c$), and to the quality of local schooling ($A$). The Appendix gives more detail about these variables. Demand is negatively related to the price of housing ($p_j$), and given that high prices drive down demand, it is assumed that high prices ‘nearby’ will cause demand otherwise attributable to nearby locations to be displaced, spilling over into $j$. We refer to this as a displaced demand effect. Hence it is assumed that demand at $j$ will be positively related to the weighted average of prices in ‘surrounding’ areas, which is denoted by the matrix product $W^D p_j$, in which $W^D$ is a weights matrix appropriate to the demand function. Other unmodelled factors, such as demand coming from non-wage earners such as the retired and students, and the effects of criminality, social quality of the neighbourhood, amenity, local taxes, etc., are represented by an iid stochastic error $\omega$ with mean zero and constant variance. The supply function is

$$q_j = b_0 + b_1 p_j + b_2 O_j - \eta W^S p_j + \varsigma \tag{5.2}$$

which assumes, ceteris paribus, that the level of housing supply $q_j$ increases in the price at $j$. For example, property owners in high price areas may be more likely to want to realise the value of their assets by offering to sell; in contrast, in low price locations home owners may prefer to withhold their properties from the market. Likewise, property developers will be attracted to areas with high prices, and by the same token it is also assumed that high prices ‘nearby’ ($W^S p_j$) will attract supply away from $j$, hence the negative sign for $\eta$, in which matrix $W^S$ is the weights matrix appropriate to the supply function. This is referred to as a displaced supply effect. In addition, controlling for price effects, supply is also assumed to relate to the size of the existing stock of properties ($O$), and other unmodelled variables are represented by an iid error $\varsigma$ with mean zero and constant variance. The reduced form is obtained by normalizing the supply function with respect to $p$, thus

$$p_j = \frac{1}{b_1} q_j - \frac{b_0}{b_1} - \frac{b_2}{b_1} O_j - \frac{\eta}{b_1} W^S p_j + \frac{\varsigma}{b_1} \tag{5.3}$$

and substituting for $q$ gives

$$p_j = c_1 \left[ a_0 + a_1 w^l_j E^l_j + a_2 w^c_j E^c_j + a_3 A_j - a_4 p_j + \nu W^D p_j + \omega \right] - c_0 - c_2 O_j - c_3 W^S p_j + \xi$$

Simplifying, and assuming $W^E = W^D = W^S$, one obtains

$$p_j = \lambda \sum_{k \neq j} W^E_{jk} p_k + d_0 + d_1 w^l_j E^l_j + d_2 w^c_j E^c_j + d_3 A_j + d_4 O_j + \varepsilon_j \tag{5.4}$$

A generalized method of moments estimator


which is the well-known spatial lag model (Ord 1975; Cliff and Ord 1981; Upton and Fingleton 1985; Anselin 1988a,b). Written in matrix terms this is

$$p = \lambda W^E p + X b + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I) \tag{5.5}$$

In Eq. (5.5), $p$ is the $n \times 1$ vector of prices, $X$ is an $n \times k$ matrix whose first column is a column of 1s and whose remaining $k - 1$ columns are the exogenous variables ($w^l E^l$, $w^c E^c$, $A$, $O$), the scalar $\lambda$ is the spatial lag coefficient and $b$ is the $k \times 1$ vector of exogenous variable coefficients. To facilitate ML estimation, the vector of residuals $\varepsilon$ is here assumed to be normally distributed with mean zero and constant variance $\sigma^2$. The matrix $W^E$ is assumed to represent relative ‘economic distance’, based on a negative exponential function of straight line distance $d_{ij}$ (in miles) between areas $i$ and $j$, and on the size of each area’s economy ($E^l$) measured in terms of the total employment level in 1999 (in units of 1,000), thus

$$W^{E*}_{ij} = E^l_i E^l_j \exp(-\beta d_{ij}) \tag{5.6}$$

The coefficient $\beta$ determining the rate of distance decay is set to the value 100. This is then row standardised, giving the asymmetric matrix

$$W^E_{ij} = \frac{W^{E*}_{ij}}{\sum_j W^{E*}_{ij}} \tag{5.7}$$
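The construction in Eqs. (5.6) and (5.7) can be illustrated with a short Python sketch. The function name and the toy inputs are hypothetical, and the zero diagonal is an assumption made here (it matches the sum over $k \neq j$ in Eq. (5.4) but is not stated alongside Eq. (5.6)).

```python
import math


def economic_distance_weights(employment, distance, beta):
    """Row-standardised weights from E_i * E_j * exp(-beta * d_ij), as in (5.6)-(5.7)."""
    n = len(employment)
    raw = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # assumption: an area carries no weight on itself
                raw[i][j] = employment[i] * employment[j] * math.exp(-beta * distance[i][j])
    weights = []
    for row in raw:
        total = sum(row)
        # divide each row by its total so that the row sums to one
        weights.append([x / total if total > 0 else 0.0 for x in row])
    return weights
```

Because each row is divided by its own total, the factor $E^l_i$ cancels within row $i$, so the standardised weight that area $i$ places on area $j$ depends only on the size of $j$'s economy and the distance between them. This is also why the standardised matrix is asymmetric even though the unstandardised one is symmetric.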

This ‘relative economic distance’ allows the quantity of demand and supply that is displaced to depend on economic mass as well as distance. Table 3 shows the resulting ML and 2sls estimates¹² and highlights the fact that, despite the presence of the spatial lag, there is significant residual autocorrelation. This is detected using a row standardized contiguity matrix (with 1s present when two UALADs share a common boundary, otherwise 0s). The LM test (Anselin 1988a,b) indicates highly significant residual autocorrelation, with the test statistic having an upper probability equal to 0.00000001 in the chi-squared distribution with one degree of freedom. The corresponding Z score indicates that the residual autocorrelation is positive. For the corresponding 2sls residuals, the standardized value of the Anselin and Kelejian (1997) statistic equals 4.740; this shows a similar high level of positive residual autocorrelation involving contiguous areas. It is clear that there ought to be spatial autocorrelation even in the presence of the spatial lag if the spatial lag is principally a net displacement effect dependent on both economic mass and distance.

Table 3 ML and 2sls estimates for the house price data

                             ML                        2sls
Variable       Parameter     Par. Est.     t-ratio     Par. Est.     t-ratio
Constant       b0            −704.0501     −9.25       −703.6700     −9.17
W^E p          λ             0.7233        11.61       0.7212        10.78
w^l E^l        b1            0.7162        9.64        0.7166        9.52
w^c E^c        b2            0.0307        7.40        0.0308        7.15
O              b3            −0.0005       −5.47       −0.0005       −5.41
A              b4            185.0888      9.57        185.0621      9.49
σ                            35.8172                   36.1263
Df                           347                       347
LM(χ²₁)                      31.30
Z                            5.595                     4.757
Loglikelihood                −1764.8763
Akaike                       3541.7526

¹² Unreported Bootstrap estimates, which may be more robust to error non-normality and heteroscedasticity, are similar. To obtain the 2sls estimates I regress $W^E p$ on the exogenous variables $w^l E^l$, $w^c E^c$, $A$, $O$ and their first spatial lags, except for the lag of $A$, which has been omitted to achieve full column rank in the matrix of instruments. Kelejian and Robinson (1993) and Kelejian and Prucha (1998) also suggest excluding high order spatial lags to avoid linear dependence.

There are many other variables that one might introduce, such as air quality and neighbourhood quality (Anselin 2003), which could turn out to be omitted spatially autocorrelated variables, the net effect of which is to induce an organised residual pattern (Dubin 1988; Brueckner 2003). Among possible additional variables that one might consider, there are some, such as the nature of the housing stock, planning and building regulations, vulnerability to flooding and therefore the additional insurance premiums for areas on flood plains, crime, social, demographic, labour market and cultural differences, and other environmental factors such as air pollution and noise, that are also likely to be spatially autocorrelated. While displaced demand or supply may cascade outwards in an autoregressive process, I assume no such chain reaction for these variables, so that a shock, on its own, has a limited spatial footprint. I represent these unmodelled effects by the moving average process, leading to the SARMA specification defined as

$$p = \lambda W^E p + X b + u, \qquad u = (I - \rho_2 W)\xi, \qquad \xi \sim \mathrm{iid}(0, \sigma^2 I) \tag{5.8}$$

Table 4 gives the results of estimating this model via GMM, the negative moving average parameter signifying positive contributions to the residuals from contiguous errors. The Bootstrap distribution for $\hat{\rho}_2$, based on 99 random samples with replacement from the residuals $\hat{u}$, gives a Bootstrap estimate equal to −0.003907 and a Bootstrap variance of 0.008523, so that $\hat{\rho}_2 = -0.4883$ is an extreme observation with respect to its Bootstrap distribution, ranking below any of the 99 reference values, suggesting a significant moving average error process.
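The ranking logic of this Bootstrap comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the GMM estimator of $\rho_2$ is not reproduced, so `estimator` is a placeholder for any function that maps a residual vector to an estimate, and `neg_lag1_autocorrelation` is a purely hypothetical stand-in used only to make the example runnable. The function names and the pseudo p-value convention $(1 + \text{count}) / (1 + B)$ are also choices made here.

```python
import random


def bootstrap_reference(residuals, estimator, n_boot=99, seed=0):
    """Re-estimate on n_boot samples drawn with replacement (each residual has probability 1/n)."""
    rng = random.Random(seed)
    n = len(residuals)
    return [estimator([rng.choice(residuals) for _ in range(n)]) for _ in range(n_boot)]


def lower_tail_rank_pvalue(observed, reference):
    """Share of values at or below the observed estimate, counting the observed one itself."""
    at_or_below = sum(1 for r in reference if r <= observed)
    return (at_or_below + 1) / (len(reference) + 1)


def neg_lag1_autocorrelation(values):
    """Hypothetical stand-in estimator: minus the lag-1 autocorrelation along a line."""
    n = len(values)
    mean = sum(values) / n
    den = sum((v - mean) ** 2 for v in values)
    if den == 0.0:
        return 0.0
    num = sum((values[t] - mean) * (values[t + 1] - mean) for t in range(n - 1))
    return -num / den
```

With 99 reference values, an observed estimate that ranks below all of them yields a pseudo p-value of 1/100, which is the situation described for $\hat{\rho}_2 = -0.4883$. Resampling breaks the spatial ordering of the residuals, which is why the reference values cluster around zero.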

Table 4 GMM estimates for the SARMA model

                             GMM (MA)
Variable       Parameter     Par. Est.     t-ratio
Constant       b0            −671.4766     −8.76
W^E p          λ             0.7429        8.22
w^l E^l        b1            0.5902        6.90
w^c E^c        b2            0.0320        6.02
O              b3            −0.0005       −4.54
A              b4            175.7372      9.08
MA             ρ2            −0.4883
σ                            34.8145
Df                           346

6 Conclusion

The paper presents a new GMM estimator for models with spatial moving average errors, and applies it to real estate price data. While further Monte Carlo exploration would add to our knowledge about the distributional properties of the estimators, there is also a need in future work to provide formal proof of the consistency of the estimators. The indication from the Monte Carlo results is that the method does produce consistent estimates, although there is small sample positive bias in $\hat{\rho}_2$. Also, the estimator is evidently reasonably robust to non-normality of disturbances. The paper advocates Bootstrap methods to allow inferences about the significance of the MA parameter. The paper shows that the method can be applied in practice by fitting a SARMA model to real estate data for small areas, providing evidence that house price levels depend on the level of income both locally and within commuting distance, on local school quality, and on the stock of properties within the area. In addition, there is a significant effect attributable to the spatial lag of prices, which is seen as the net outcome of the displacement of demand and of supply between areas. However, this does not eliminate spatial autocorrelation attributable to unmodelled effects, and these are captured by the spatial moving average process. The implication for shocks is that they will not be restricted as under a purely MA process because of the confounding effect of the SAR process. Under the model, the wider transmission of shocks can be attributed to the displacement effects leading to the direct price interaction across areas in the reduced form, rather than to the spillover of shock effects per se.

Acknowledgments This paper was initially presented at the International Workshop on Spatial Econometrics and Statistics, 25–27 May, 2006, Rome, Italy. I am grateful to Michael Pfaffermayr and the other participants for their contribution to the workshop discussion of this paper. I would also like to thank the referees and Editor for their advice and comments.

Appendix A: Data

The dependent variable $p$ is the mean transaction price (all types of residential property) by area for the period July–Sept 2001 for the $n = 353$ English Unitary Authority and Local Authority Districts¹³ (UALADs). The data were provided by the Land Registry. The wage rate ($w^l$) is the gross weekly pay for all occupations and both males and females taken from the Office for National Statistics’¹⁴ (ONS) New Earnings Survey. The employment level for the year 2000 is based on the annual business enquiry employee analysis, also carried out by the ONS and available on the NOMIS database. Total earnings in an area is the product of the average wage rate ($w^l$) and the total level of employment in 2000, denoted by $w^l E^l$. This is assumed to be predetermined with respect to 2001 price levels. The vector $w^c E^c$ denotes total earnings within commuting distance of a UALAD. This is equal to the matrix product of the $n \times n$ matrix $C$ and the $n \times 1$ vector $w^l E^l$. The matrix $C$ is defined as follows

$$C_{ij} = \exp(-\delta_i d_{ij}), \quad i \neq j; \qquad C_{ij} = 0, \quad i = j; \qquad C_{ij} = 0, \quad d_{ij} > 100 \text{ km}$$

Cell $(i, j)$ of the $C$ matrix is a function of the (straight line) distance ($d_{ij}$) between areas $i$ and $j$ and an area specific coefficient $\delta_i$. This allows for the different levels of transport infrastructure and commuting in different areas, with the choice of exponent $\delta_i$ based on empirical comparisons with observed census data¹⁵ on travel-to-work patterns. Table 5 shows the overall proportion of workers¹⁶ living in England and Wales travelling various distances from home to work. Given observed travel percentages comparable to Table 5 for each area, the exponent $\delta_i$ for each area was chosen by iterating the function $\exp(-\delta_i d_{ij})$ through a range of values to obtain the value giving the closest fit¹⁷ to each area’s commuting data.

Table 5 Commuting distances to work in England and Wales

Distance     <2km      2–4       5–9       10–19     20–29     30–39     >40km     Total
%            26.63     25.28     20.93     15.90     4.96      2.05      4.25      100

The quality of local schooling $A$ has been shown to be significantly associated with higher residential property prices (see for example Leech and Campos 2003; Cheshire and Sheppard 2004; Gibbons and Machin 2003). The variable $A$ is based on the results of the 1998 key stage 2 tests taken by 11-year-old pupils. These results, collated by Oxford University for the ONS, have been converted into an indicator of the quality

¹³ Small administrative areas, with median area equal to 250.77 sq. km.
¹⁴ Available on the NOMIS website (the ONS on-line labour market statistics database).
¹⁵ 1991 Census of Population - Special Workplace Statistics, available from NOMIS.
¹⁶ Total employees and self-employed with a workplace coded, tabulated by residents in each zone (10% sample).
¹⁷ Minimum of the sum of the squared deviations of the observed proportions in each distance band up to 40 km and the proportions of the sum of the function $\exp(-\delta_i d_{ij})$ calculated using the upper limit of each distance band.
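The commuting matrix $C$ and the resulting earnings-within-commuting-distance vector can be sketched in Python as below. The function names and the toy numbers in the usage note are hypothetical; the calibration of each $\delta_i$ against census commuting data is not reproduced here, so the decay coefficients are simply taken as given.

```python
import math


def commuting_matrix(distance_km, delta, cutoff_km=100.0):
    """C_ij = exp(-delta_i * d_ij) for i != j within the cutoff, and 0 otherwise."""
    n = len(delta)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # zero on the diagonal and beyond the 100 km cutoff
            if i != j and distance_km[i][j] <= cutoff_km:
                c[i][j] = math.exp(-delta[i] * distance_km[i][j])
    return c


def earnings_within_commuting_distance(c, local_earnings):
    """Matrix product of C with the vector of local earnings (wage rate times employment)."""
    return [sum(cij * e for cij, e in zip(row, local_earnings)) for row in c]
```

For instance, with three areas at distances of 10, 50 and 150 km, the pair that is 150 km apart contributes nothing to each other's commuting earnings, while the decay rate applied to each row is that of the origin area $i$.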


Table 6 Bias and RMSE with increasing positive dependence (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10) ρ2 = 0

ρ2 = −0.25

ρ2 = −0.5

ρ2 = −0.75

ρ2 = −0.95

Bias −0.03939

−0.05180

−0.06411

−0.06263

−0.05949

−0.03051

−0.00369

−0.00061

0.01321

0.00615

0.00456

0.01227

0.02726

0.06305

0.08007

0.13611

0.18186

γ0

1.105

0.8463

0.9546

1.051

1.118

λ

0.05009

0.04305

0.04139

0.04789

0.04739

γ0 λ γ1 γ2 ρ2

0.00200

0.01438

0.00200

0.00530

0.00043

0.00270

−0.00202

0.00305

RMSE

γ1

0.4183

0.4401

0.3184

0.3837

0.3669

γ2

0.3762

0.3805

0.4214

0.3290

0.3153

ρ2

0.1672

0.1621

0.1928

0.2380

0.2932

The matrix W is a Rook’s case contiguity matrix for a 9×9 square, hence the number of locations is n = 81, and W is of dimension 81 × 81

of schooling in each UALAD. Commencing with 8,413 (English) Wards each with an $A$ (mean) score, I obtain the mean $A$ score for the 353 English UALADs (there are on average 24 Wards per UALAD). The $A$ variable pre-dates the house price variable and so for the purposes of estimation is predetermined. House prices are also assumed to depend on a measure of the stock of properties. This is represented by the number of owner-occupier households ($O$) reported in the 1991 Census of Population, Local Base Statistics, Table L20 Tenure and amenities: Households with residents; residents in households. This is available in the NOMIS database. These data are predetermined with respect to ‘current’ prices.

Appendix B: Monte Carlo investigation

The results of the Monte Carlo investigation are given in Tables 6, 7, 8, 9, 10, 11, 12, 13, and 14.

$$Y = \lambda W Y + \gamma_0 + \gamma_1 H_1 + \gamma_2 H_2 + u \quad \text{and} \quad u = (I - \rho_2 W)\xi = \xi - \rho_2 \bar{\xi},$$

where $\xi \sim N(0, \omega^2)$. The exogenous variables $H_1$ and $H_2$ are draws from a uniform distribution with upper and lower bounds equal to 0 and 1.

$$\text{Bias} = \text{median} - \text{true parameter value}$$

$$\text{RMSE} = \left[ \text{bias}^2 + \left( \frac{IQ}{1.35} \right)^2 \right]^{0.5}$$
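One replication of this data-generating process on a Rook's case lattice can be sketched as follows. It is a hedged illustration only: the function names are invented here, and solving for $Y$ by fixed-point iteration is a convenience chosen to avoid a matrix inverse (it converges because $W$ is row-normalised and $|\lambda| < 1$), not a description of how the paper's simulations were coded.

```python
import random


def rook_weights(side):
    """Row-normalised Rook's case contiguity matrix for a side x side lattice."""
    n = side * side
    w = [[0.0] * n for _ in range(n)]
    for r in range(side):
        for c in range(side):
            nbrs = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs = [(a, b) for a, b in nbrs if 0 <= a < side and 0 <= b < side]
            for a, b in nbrs:
                w[r * side + c][a * side + b] = 1.0 / len(nbrs)
    return w


def matvec(m, v):
    """Product of a list-of-lists matrix with a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def simulate(w, lam, rho2, gamma, omega, rng, n_iter=200):
    """One draw of Y = lam*W*Y + g0 + g1*H1 + g2*H2 + u with u = xi - rho2*W*xi."""
    n = len(w)
    h1 = [rng.uniform(0.0, 1.0) for _ in range(n)]
    h2 = [rng.uniform(0.0, 1.0) for _ in range(n)]
    xi = [rng.gauss(0.0, omega) for _ in range(n)]       # omega is a standard deviation
    u = [x - rho2 * wx for x, wx in zip(xi, matvec(w, xi))]
    rhs = [gamma[0] + gamma[1] * a + gamma[2] * b + e for a, b, e in zip(h1, h2, u)]
    y = rhs[:]
    for _ in range(n_iter):  # iterate y <- lam*W*y + rhs until it settles
        y = [lam * a + b for a, b in zip(matvec(w, y), rhs)]
    return y, rhs
```

A call such as `simulate(rook_weights(9), 0.5, -0.5, (1.0, 10.0, 10.0), 1.0, random.Random(42))` corresponds to the 9 × 9 design with $n = 81$, $\lambda = 0.5$, $\rho_2 = -0.5$ and $\omega^2 = 1$ used in several of the tables.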


Table 7 Bias and RMSE with increasing negative dependence (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10) ρ2 = 0

ρ2 = 0.25

ρ2 = 0.5

ρ2 = 0.75

−0.04359

−0.034565

−0.014012

−0.04048

−0.01407

−0.009049

−0.003599

−0.01565

0.02369

0.003091

−0.015776

−0.03566

ρ2 = 0.95

Bias γ0 λ γ1 γ2 ρ2

0.00275

0.001594

−0.00131

−0.000116

RMSE

0.001342

−0.008423

0.00199

−0.01088

−0.02359 0.00072

0.00795 −0.00893

−0.07260

γ0

0.7556

0.6521

0.5302

0.4298

0.3585

λ

0.03675

0.03469

0.03212

0.03062

0.03164

γ1

0.3476

0.3632

0.3976

0.4137

0.4320

γ2

0.4368

0.4478

0.4643

0.4763

0.4844

ρ2

0.1542

0.1585

0.1812

0.2316

0.3210

The matrix W is a Rook’s case contiguity matrix for a 9×9 square, hence the number of locations is n = 81, and W is of dimension 81 × 81

Table 8 Bias and RMSE with increasing lattice size, with positive dependence (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10, ρ2 = −0.5) Lattice size

5×5

7×7

9×9

11 × 11

13 × 13

−0.2368

−0.13580

−0.05131

−0.05078

−0.03191

−0.0315

0.01304

−0.00155

−0.01711

−0.01573

0.2739

0.13051

0.09159

0.05433

0.04499

Bias γ0 λ γ1 γ2 ρ2

0.0073

−0.0438

0.00551

0.00028

0.00278

0.00328

0.00124

0.02429

0.00236

−0.00780

RMSE γ0

1.886

1.349

0.9417

0.9575

λ

0.08244

0.06069

0.04393

0.04248

0.8380 0.03541

γ1

0.7594

0.4242

0.4141

0.2984

0.2538

γ2

0.6625

0.4811

0.3529

0.2892

0.2589

ρ2

0.3863

0.2466

0.1884

0.1673

0.1410

The matrix W is a Rook’s case contiguity matrix, n = 25, 49, 81, 121, 169

IQ is the interquantile range, equal to the difference between the 0.75 and 0.25 quantiles. While this approximation is based on IQ rather than the variance, under normality the median is equal to the mean and, apart from slight rounding, IQ/1.35 is the standard deviation, so this measure reduces to the standard RMSE statistic (see Kapoor et al. 2007).
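The bias and RMSE measures defined above translate directly into code. The sketch below uses Python's standard `statistics` module; the function name is illustrative, and the particular quantile interpolation rule (the module's default) is an assumption, since the paper does not say how the 0.25 and 0.75 quantiles were computed.

```python
import statistics


def bias_and_rmse(estimates, true_value):
    """Median-based bias and the IQ-based RMSE: sqrt(bias^2 + (IQ / 1.35)^2)."""
    median = statistics.median(estimates)
    q1, _, q3 = statistics.quantiles(estimates, n=4)  # 0.25, 0.50 and 0.75 cut points
    bias = median - true_value
    iq = q3 - q1                                      # interquantile range
    rmse = (bias ** 2 + (iq / 1.35) ** 2) ** 0.5
    return bias, rmse
```

Applied to the vector of Monte Carlo estimates of one parameter, this returns the two entries reported for that parameter in the Bias and RMSE panels of the tables. The divisor 1.35 reflects the fact that for a normal distribution the interquantile range is roughly 1.35 standard deviations.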


Table 9 Bias and RMSE with increasing lattice size, with negative dependence (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10, ρ2 = 0.5) Lattice size

5×5

7×7

9×9

11 × 11

13 × 13

Bias γ0 λ γ1 γ2 ρ2 RMSE

0.01140

0.008986

0.01572

−0.000547

−0.00058

−0.008423

−0.008858

−0.00338

0.5369

−0.00354

−0.014012

−0.04602

−0.00281

−0.003599

−0.02613

−0.02669

−0.015776

−0.00093

−0.03493

0.00102

0.001342

−0.04155

0.007480

−0.004339

−0.01216

−0.01978

γ0

2.041

1.022

0.5302

0.4672

λ

0.1024

0.05315

0.03212

0.02859

0.02825

γ1

0.8913

0.5116

0.3976

0.3519

0.3031

γ2

0.7220

0.5154

0.4643

0.3666

0.2698

ρ2

0.3308

0.2217

0.1812

0.1591

0.1377

The matrix W is a Rook’s case contiguity matrix, n = 25, 49, 81, 121, 169

Table 10 Bias with increasing positive dependence, zero endogenous lag (ω2 = 1, λ = 0, γ0 = 1, γ1 = 10, γ2 = 10) Bias

ρ2 = 0

γ0

0.009549

λ

0.000857

γ1

ρ2 = −0.25

ρ2 = −0.5

ρ2 = −0.75

ρ2 = −0.95

−0.01349

−0.03156

0.02958

0.00325

0.00001

0.002693

0.00450

0.01544

0.01128

0.02087

γ2

0.001488

0.00323

0.01373

0.01448

0.00765

ρ2

0.017635

0.04004

0.07117

0.09901

0.12915

0.00164

−0.00863

0.00285

The matrix W is a Rook’s case contiguity matrix for a 9×9 square, hence the number of locations is n = 81, and W is of dimension 81 × 81

Table 11 Bias with positive dependence, W based on English local authority areas (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10) Bias

ρ2 = −0.5

ρ2 = −0.75

−0.02116

−0.05811

γ1

0.00368

γ2

−0.00821

−0.00680

γ0 λ

ρ2

0.00078

0.03848

0.00282

−0.00746

0.07116

The matrix W is a contiguity matrix for English unitary authority and local authority districts, hence n = 353 and W is of dimension 353 × 353


Table 12 Bias with increasing positive dependence, W based Queen’s case contiguity matrix (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10) Bias γ0 λ γ1 γ2 ρ2

ρ2 = 0

ρ2 = −0.25

ρ2 = −0.5

ρ2 = −0.75

ρ2 = −0.95

−0.05545

−0.09031

−0.12330

−0.15041

−0.19883

−0.01710

−0.01504

−0.01599

0.00025

0.00947

0.08621

0.11039

0.14528

−0.01482

−0.00751

0.00453

0.00547

−0.00437

0.00666

−0.00635

0.01013

−0.00039

0.01262

0.18766

0.22444

The matrix W is a Queen’s case contiguity matrix for a 9 × 9 square, hence the number of locations is n = 81, and W is of dimension 81 × 81

Table 13 Bias with increasing lattice size, with positive dependence and W based Queen’s case contiguity matrix (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10, ρ2 = −0.5)

Lattice size

5×5

7×7

−1.3587

−0.4148

−0.0685

0.0147

9×9

11 × 11

13 × 13

15 × 15

20 × 20

−0.12330

−0.05099

−0.13849

−0.05895

−0.02488

−0.01599

−0.00026

−0.01288

0.01715

−0.00477

0.14528

0.11147

0.07805

Bias γ0 λ γ1 γ2 ρ2

0.0491

0.0288 0.4251

0.0184

−0.0357

0.2503

0.00666

−0.00039

0.00565

−0.00778

0.00673

−0.00611

0.00250

−0.01167

0.06296

0.00225

0.00098 0.03917

The matrix W is a Queen’s case contiguity matrix, n = 25, 49, 81, 121, 169, 225 and 400

Table 14 Bias with increasing lattice size, with positive dependence and W based on a torus (ω2 = 1, λ = 0.5, γ0 = 1, γ1 = 10, γ2 = 10, ρ2 = −0.5)

Lattice size

5×5

7×7

9×9

11 × 11

13 × 13

Bias γ0 λ γ1 γ2 ρ2

−0.17578

−0.09959

−0.03626

−0.01696

−0.10304

−0.00836

0.01238

0.01888

0.01372

−0.03448

−0.01573

−0.00789

0.00442

−0.03065

0.26219

0.00253

0.14833

0.00476

0.08906

−0.00002

−0.01300

0.06774

0.00343 0.00536

0.03818

The matrix W is a Rook’s case contiguity matrix based on a torus (with opposite edges of the lattice considered to be contiguous), n = 25, 49, 81, 121, 169

In all cases W is normalised to row totals equal to 1 and the bias and RMSE are based on 1,000 Monte Carlo replications with n = 25, 49, 81, 121, 169, 225 and 400.

References

Anselin L (1988a) Lagrange multiplier test diagnostics for spatial dependence and spatial heterogeneity. Geogr Anal 20:1–17
Anselin L (1988b) Spatial econometrics: methods and models. Kluwer, Dordrecht
Anselin L (2003) Spatial externalities, spatial multipliers, and spatial econometrics. Int Reg Sci Rev 26:153–166
Anselin L, Kelejian HH (1997) Testing for spatial error autocorrelation in the presence of endogenous regressors. Int Reg Sci Rev 20:153–182
Anselin L, Florax R (1995) New directions in spatial econometrics. Springer, London
Anselin L, Bera A (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah A, Giles DE (eds) Handbook of applied economic statistics. Marcel Dekker, New York
Brueckner JK (2003) Strategic interaction among governments: an overview of empirical studies. Int Reg Sci Rev 26:175–188
Cheshire P, Sheppard S (2004) Capitalising the value of free schools: the impact of supply characteristics and uncertainty. Econ J 114:F397–F424
Cliff AD, Ord JK (1981) Spatial processes: models and applications. Pion, London
Dubin RA (1988) Estimation of regression coefficients in the presence of spatially autocorrelated error terms. Rev Econ Stat 70:466–474
Gibbons S, Machin S (2003) Valuing English primary schools. J Urban Econ 53:197–219
Haining RP (1978) The moving average model for spatial interaction. Trans Inst Br Geogr 3:202–225
Kapoor M, Kelejian HH, Prucha I (2007) Panel data models with spatially correlated error components. J Econom (forthcoming)
Kelejian HH, Robinson DP (1993) A suggested method of estimation for spatial interdependent models with autocorrelated errors, and an application to a county expenditure model. Papers Reg Sci 72:297–312
Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Financ Econ 17:99–121
Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40:509–533
Kelejian HH, Prucha IR (2004) Estimation of simultaneous systems of spatially interrelated cross sectional equations. J Econom 118:27–50
Leech D, Campos E (2003) Is comprehensive education really free?: a case-study of the effects of secondary school admissions policies on house prices in one local area. J Roy Stat Soc A 166:135–154
Ord JK (1975) Estimation methods for models of spatial interaction. J Am Stat Assoc 70:120–126
Upton GJG, Fingleton B (1985) Spatial data analysis by example, vol 1. Wiley, Chichester

Spatial analysis of urban growth in Spain, 1900–2001

Julie Le Gallo · Coro Chasco

Abstract The purpose of this paper is to improve the knowledge of the Spanish urban system. We study the evolution of population growth among the group of 722 municipalities included in the Spanish urban areas over the period 1900–2001. A spatial SUR model is estimated for Zipf’s law and shows the existence of two main phases: divergence (1900–1980) and convergence (1980–2001). Then, the cross-sectional distribution of urban population is characterized by means of nonparametric estimations of density functions and the growth process is modeled as a first-order stationary Markov chain. Spatial effects are finally introduced within the Markov chain framework using regional conditioning. This analysis shows a low interclass mobility, i.e., a high-persistence of urban municipalities to stay in their own class from one decade to another over the whole period, and the influence of the geographical environment on urban population dynamism.

Previous versions of this paper were presented at the 45th Congress of the European Regional Science Association (Vrije Universiteit, Amsterdam, Netherlands, August 23–27, 2005) and at the International Workshop on Spatial Econometrics and Statistics (Rome, Italy, May 25–27, 2006). We would like to thank two anonymous referees, M. Bosker, P. Cheshire, A. Carrington, B. Fingleton, R. Guillain, E. Lopez-Bazo, J. Paelinck and the other participants of these meetings for their valuable comments. Coro Chasco acknowledges financial support from the Spanish Ministry of Education and Science SEJ2006-14277-C04-01. The usual disclaimers apply.

J. Le Gallo, CRESE, Université de Franche-Comté, 45D, avenue de l’Observatoire, 25030 Besancon Cedex, France

C. Chasco, Departamento de Economía Aplicada, Universidad Autónoma de Madrid, 28049 Madrid, Spain


J. Le Gallo, C. Chasco

Keywords Convergence · Urban growth · Spatial autocorrelation · Spatial SUR models · Markov chains JEL Classifications C14 · C21 · O18 1 Introduction Economic development is associated with the movement of population from the countryside to cities. This observation raises the question of how cities of different sizes grow during the process of development. The size distribution of cities may become more even over time if smaller cities catch up with larger ones. At the other extreme, urbanization may take the form of the expansion of the largest cities. In this case, the size distribution would become more unequal. In this paper, we consider the particular experience of Spain between 1900 and 2001. Our aim is thus to improve the knowledge of the Spanish urban system and answer the following questions. How has the size distribution evolved over the last century? Has it become more even or more unequal? Is there a lot of mobility of cities within this size distribution? These questions are particularly relevant since the Spanish urbanization process has mainly taken place during the twentieth century producing significant processes of industrialization and economic growth. Specifically in Spain, this process has not been uniform and different results may be found depending on the definition of an “urban area”. In fact, there is no official definition of an “urban area” in Spain and it is not easy to obtain statistical data at the level of municipalities. Hence, analyses of the Spanish urban system are still scarce. Nevertheless, some authors have considered the group of “main cities” -above 50,000 inhabitants—as urban units (e.g. Lanaspa et al. 2003, 2004, Mella and Chasco 2006). In this paper, we propose to work with the set of municipalities that form the Spanish “urban areas”, as defined by the Ministerio de Fomento (2000). 
It is a heterogeneous group of municipalites that not only includes the main cities but also all the satellite towns that make up the complete metropolitan area. We study the evolution of population growth among this set of 722 municipalities included in the present Spanish urban areas over the period 1900–2001. In order to examine urban evolution and answer the preceding questions, we first examine the city size distribution by centering on the question of whether Zipf’s law or its deterministic equivalent, the rank-size rule, holds for Spanish cities. Zipf’s law has been applied in numerous studies (see Gabaix and Ioannides 2004, for a recent review) but none of them considers the possibility of spatial effects. However, due to the geographical nature of the empirical data used, we emphasize in this paper the need to pay attention to the appropriate econometric methodology needed for a reliable statistical inference. Therefore, we formally test for spatial autocorrelation and spatial heteroskedasticity in the spatial SUR framework suggested by Anselin (1988). This empirical work on the rank-size rule is essentially involved with one particular characteristic of the distribution of city sizes: the shape of that distribution. However, some papers have also paid attention to the intra-distribution dynamics (Eaton and Eckstein 1997; Black and Henderson 1999, 2003; Lanaspa et al. 2004). Indeed, as

Spatial analysis of urban growth in Spain, 1900–2001

61

Quah (1996) has forcefully argued, typical cross-section or panel data econometric techniques do not allow inference about patterns in the intertemporal evolution of the entire cross-sectional distribution. Making such inferences requires estimating directly the full dynamics of the entire distribution of cities. We therefore follow this strand of literature by focusing on how cities develop relatively to the rest of the urban system, both in terms of rankings and relative sizes. For that purpose, the cross-sectional distribution of urban population is analyzed by means of nonparametric estimations of density functions and the growth process is modeled as a first-order stationary Markov chain. The evolution of the shape of the population cross-sectional distribution and the changes in the municipalities’ relative positions within this distribution is then able to uncover the existence of alternate divergence/convergence trends. Moreover, as in the analysis of Zipf’s law, we also adopt an explicit spatial approach by measuring the extent to which the geographical environment influences the urban municipalities’ relative position within the population cross-sectional distribution. Hence, we extend previous studies focusing on the Spanish case (mainly Lanaspa et al. 2003, 2004) in two ways. First, not only do we consider a broader set of cities, but we also analyze the complete set of municipalities that actually belong to the Spanish urban areas: bigger cities and metropolitan towns. This feature allows for a better knowledge of the evolution of the complete urban system in Spain, which is not exactly the same as the one experienced by other groups of cities in the same country. 
Second, we explicitly introduce spatial dependence specifications and tests in both Zipf’s law and Markov Chains analyses to capture the influence of space on convergence and transition probabilities and we perform a more complete analysis of movement speed and form of convergence in the city size distribution. The paper is organized as follows. In the first section, the evolution of the disparities between the Spanish urban municipalities is characterized by examining the population cross-sectional distribution over the period from 1900 to 2001. We also test the validity of Zipf’s law over this period. In the second section, we estimate a first-order Markov chain and analyze its ergodic properties. The article concludes with a summary of key findings. 2 The evolution of the Spanish urban system 1900–2001 This section examines growth in the Spanish urban system and changes in the relative size distribution of urban municipalities over a 100-year period. 2.1 Data In order to explore these issues, we need a data set with urban areas defined consistently over the century. For that purpose, we have considered the classification proposed by the Spanish Ministry of Urbanism and Public Works (Ministerio de Fomento 2000). It divides the Spanish territory into urban areas, which include a set of 722 municipalities: (1) a set of 495 towns included in the 65 “Large Urban Areas” (areas above 50000 inhabitants each); (2) the group of 227 municipalities considered as “Small Urban Areas” (towns above 10000 inhabitants not included in the Large Urban Areas, with


J. Le Gallo, C. Chasco

minor corrections). This is rather different from the approach in Lanaspa et al. (2003), who worked with a sample of the 100 largest cities¹ as proxies of the Spanish urban system. Moreover, these 722 urban settlements are located across the whole Spanish territory: Andalusia (137), Aragón (10), Asturias (25), Balearics (13), Canary Islands (29), Cantabria (5), Castile and León (23), Castile-La Mancha (17), Catalonia (192), Valencian Community (96), Extremadura (11), Galicia (32), Madrid (33), Murcia (19), Navarre (18), Basque Country (58), Rioja (2), Ceuta and Melilla (2). The evolution of population distribution is analyzed using the Census data over the period from 1900 to 2001. Eleven census years are considered: 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1981, 1991 and 2001. The data on population are extracted from the Spanish Office for Statistics (INE) databank.²

2.2 The evolution of the shape of urban population distribution

Using this dataset, we first analyze the evolution of the shape of the urban size distribution. For that purpose, we compute nonparametric kernel density estimates of the urban population distribution for each decade and examine whether it is unimodal or multimodal. More precisely, we examine the relative urban municipality size distribution in 1900 and the way this distribution has changed in 1950, 1970 and 2001. Relative size distributions are considered, where size for each decade is normalized by dividing by the average municipality size. Figure 1 shows the relative log urban municipality size distributions in 1900, 1950, 1970 and 2001. This density plot may be interpreted as the continuous equivalent of a histogram in which the number of intervals tends to infinity. From the definition of the data, 1 on the horizontal axis indicates Spanish average city size, 2 indicates twice this average, and so on.

Figure 1 shows that the distribution is bimodal in 1900 (with a minor mode around 70–80% of the average) but becomes unimodal in 2001. This may reflect the existence, in 1900, of a group of urban municipalities with sizes below the average, converging toward a lower population level than the rest of the towns. Compared with 1900, more urban municipalities in 2001 had a population close to the Spanish average. The distributions in 1900 and 1950 are quite similar, while the central mass significantly increased in 1970 to reach the highest point in the 2001 distribution. This progressive concentration of probability mass around 100% can be interpreted as evidence for slight convergence. This result is similar to others in the literature (Lanaspa et al. 2003, 2004, for the largest Spanish cities; Anderson and Ge 2005, for Chinese cities), though it differs from the results of Black and Henderson (2003) for US metropolitan areas.
1 These authors chose a relatively arbitrary number of “largest cities” after finding that the results were qualitatively robust to different sample sizes. In a later paper, Lanaspa et al. (2004) chose different subsets with the 100, 300, 500 and 700 most-populated municipalities.
2 These data are available on the INE webpage: http://www.ine.es.

Spatial analysis of urban growth in Spain, 1900–2001


Fig. 1 Densities of log relative urban municipality size (curves for 1900, 1950, 1970 and 2001)
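The density estimation described above can be illustrated with a minimal sketch. It assumes a Gaussian kernel and a normal-reference rule-of-thumb bandwidth, neither of which is stated in the chapter; the function names and the population figures are hypothetical.

```python
import math

def relative_sizes(populations):
    """Population of each municipality divided by the cross-sectional mean."""
    mean_pop = sum(populations) / len(populations)
    return [p / mean_pop for p in populations]

def gaussian_kde(sample, grid, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated at each grid point."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not the authors' choice).
        mean = sum(sample) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        bandwidth = 1.06 * sd * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample)
        for g in grid
    ]

# Hypothetical populations for one census year.
populations = [1200, 3500, 5000, 8000, 15000, 40000, 250000]
log_relative = [math.log(r) for r in relative_sizes(populations)]
grid = [-10 + 0.01 * i for i in range(2001)]
density = gaussian_kde(log_relative, grid)
```

Repeating the estimation for each census year and overlaying the curves would give a plot in the spirit of Fig. 1; a different kernel or bandwidth could change the number of visible modes.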

2.3 Zipf’s law, or the rank-size rule

We continue our exploration of the evolution of the Spanish urban municipality size distribution by using Zipf’s law, or the rank-size rule. Zipf (1949) claimed that the size distribution of cities follows a Pareto law (Pareto 1897), that is:

$$R = a \cdot S^{-b} \tag{1}$$

where R is the city rank order of the population distribution; S is the population of the cities; and a and b are parameters, with the latter being the Pareto exponent, always positive by construction. The rank size rule, which emerged from regularly observed features of the data lacking any economic theoretic foundation, has recently been analyzed, among others, by Krugman (1996), Eaton and Eckstein (1997), Overman and Ioannides (2001), Dobkins and Ioannides (2000), Davis and Weinstein (2002), Ioannides and Overman (2003), Cordoba (2003), Rossi-Hansberg and Wright (2004), Gabaix and Ioannides (2004), Soo (2005), Gabaix and Ibragimov (2006). Gabaix (1999a,b) has derived a statistical explanation of Zipf’s law for cities. He shows that if different cities grow randomly with the same expected growth rate and the same variance, the limit distribution of city size will converge so as to obey Zipf’s law. Duranton (2006) provides some economic foundations for Zipf’s law: he embeds the endogenous growth model suggested by Grossman and Helpman (1991) into an urban framework and views investments in R&D as the main driver of city growth. This model can then generate Zipf’s law under some particular condition detailed in the paper. Finally, Nitsch (2005) carries out a meta-analysis combining 515 estimates from 29 studies and finds that cities are on average more evenly distributed than suggested by Zipf’s law. Formally, in this framework, the size distribution of cities is more or less even, depending on the value of the Pareto exponent (b). At the limit, if b tends to infinity, then


all the cities will be of an equal size. When b is equal to one, we obtain the well-known rank-size rule or Zipf’s law. According to this rule, city populations among any group of cities at any time are proportional to the inverse of the ranking of their populations in that group. The Pareto exponent can therefore be interpreted as a convergence indicator. Indeed, values that fall over time indicate relatively more important roles (increasing weights) for the largest cities. More precisely, as b decreases, a 1% increase in city size produces a smaller fall (in %) in rank and the city size distribution becomes more spread out. Therefore, this will cause a divergence trend inside the group of urban municipalities or greater metropolitan concentration. Likewise, a 1% increase in city size produces a larger fall (in %) in rank as b increases. Therefore, increasing values of the Pareto exponent represent convergence dynamics, or in other words, greater dispersion of the population outside the large metropolitan areas and a more balanced population distribution between urban centers of different sizes. Empirically, starting from Eq. (1), we take logarithms on both sides and estimate the resulting linear expression for the set of 722 urban municipalities (i = 1, …, 722) for each of the eleven decades (t = 1, …, 11) under consideration:

$$\ln R_{it} = \ln a_t - b_t \cdot \ln S_{it} + \varepsilon_{it} \tag{2}$$
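As an illustration of Eq. (2) for a single cross-section, the following sketch fits the log rank on the log size by ordinary least squares. It is a plain single-equation OLS, not the spatial SUR estimator used in the chapter; the function name, the `shift` argument (which allows a Rank − 1/2 type correction) and the data are hypothetical.

```python
import math

def rank_size_ols(populations, shift=0.0):
    """OLS fit of ln(rank - shift) = alpha - b * ln(size); returns (alpha, b)."""
    sizes = sorted(populations, reverse=True)      # rank 1 is the largest city
    x = [math.log(s) for s in sizes]
    y = [math.log(rank - shift) for rank in range(1, len(sizes) + 1)]
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    alpha = y_bar - slope * x_bar
    return alpha, -slope                           # Eq. (2) writes the slope as -b

# Hypothetical sizes that obey the rank-size rule exactly: S_r = 1,000,000 / r.
exact_zipf = [1_000_000 / r for r in range(1, 51)]
alpha_hat, b_hat = rank_size_ols(exact_zipf)       # b_hat is 1 up to rounding error
```

With data that follow the rule exactly, the estimated Pareto exponent equals one and the intercept equals the log of the largest city's size; real data would of course deviate from this.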

Gabaix and Ioannides (2004) have shown by Monte Carlo simulations that OLS estimation of this equation presents several pitfalls in small samples.³ Moreover, OLS may be affected by the omission of spatial autocorrelation. More precisely, if spatially autocorrelated residuals represent the effects of an unmodelled spatially autoregressive error process, then the parameter estimates remain unbiased but become inefficient. Statistical inference is biased in this case. Conversely, if they are due to the omission of spatially autocorrelated variables, then the parameter estimates are biased. Therefore, and since we have allowed for different intercepts and slopes in each period, we have followed the strategy suggested by Anselin (1988, p. 203) for the specification of spatial SUR models. In a first stage, we have estimated Eq. (2) by Ordinary Least Squares (OLS) for the 11 spatial equations individually considered (i.e., one equation for each decade under consideration). For each model, we have tested for the presence of spatial effects. As shown in Table 1, the OLS residuals of the 11 equations are non-normal and exhibit both heteroskedasticity and spatial autocorrelation (as pointed out by the Jarque–Bera, Koenker–Bassett and Kelejian–Robinson tests, respectively). Therefore, we can conclude that both spatial effects (spatial autocorrelation and spatial heterogeneity characterized by heteroskedasticity) are present in the 11 models. We will focus in the remaining discussion on the former problem, leaving an explicit treatment of heteroskedasticity problems for further research. Regarding spatial autocorrelation, the non-normality of the error terms implies that the

3 Gabaix and Ibragimov (2006) also point out that the OLS estimator of b in Eq. (2) is strongly biased in small samples. To overcome this problem, they provide a simple practical remedy and show, by Monte Carlo simulations, that this bias is considerably reduced when using Rank − 1/2 and running the regression log(Rank − 1/2) = α − b log(Size). Although we do not have a small sample and we use estimation methods other than OLS, we have also estimated Eq. (2) introducing this change. As expected, the results obtained are robust to the modification suggested by Gabaix and Ibragimov, and they are available upon request from the authors.

Table 1 Rank-size regressions, Spanish urban municipalities 1900–2001

        OLS basic model                      Spatial SUR       Spatial SUR spatial-error   Spatial SUR spatial lag
                                             model (ML)        model (ML)                  model (ML)
Year    α̂       b̂       JB    KB    KR      α̂       b̂        α̂       b̂       λ̂          α̂       b̂       ρ̂
1900    11.04   −0.66   748   116   188     10.66   −0.62    10.68   −0.62   0.16       10.43   −0.62   0.04
1910    11.06   −0.66   728   111   249     10.65   −0.61    10.67   −0.61   0.19       10.29   −0.60   0.05
1920    11.04   −0.65   794   100   287     10.59   −0.60    10.61   −0.60   0.19       10.18   −0.59   0.06
1930    11.05   −0.64   668    89   313     10.55   −0.58    10.56   −0.58   0.20        9.99   −0.57   0.08
1940    10.91   −0.62   757    85   418     10.40   −0.56    10.41   −0.56   0.22        9.85   −0.55   0.08
1950    10.82   −0.60   682    82   493     10.30   −0.54    10.32   −0.54   0.22        9.69   −0.53   0.09
1960    10.81   −0.59   616    62   540     10.30   −0.53    10.32   −0.53   0.23        9.66   −0.52   0.09
1970    10.78   −0.57   614    37   580     10.31   −0.52    10.32   −0.52   0.26        9.48   −0.50   0.12
1981    10.67   −0.54   721    31   509     10.21   −0.50    10.23   −0.50   0.27        9.25   −0.48   0.14
1991    10.90   −0.56   744    25   369     10.39   −0.51    10.41   −0.51   0.25        9.51   −0.49   0.13
2001    11.39   −0.60   837    13   237     10.83   −0.54    10.87   −0.55   0.30        9.81   −0.52   0.14

                            Spatial SUR   Spatial SUR spatial-error   Spatial SUR spatial lag
Diagonality tests
  LM test                   30.274        –                           –
  LR test                   22.531        2.928                       2.849
Wald homogeneity test
  b parameter               687           616                         513
  spatial parameter                       21                          45
Spatial dependence
  LM-spatial-error          217           –                           –
  LM-spatial lag            81            –                           –
Goodness of fit
  LIK                       –             6.833                       6.778
  AIC                       –             26.34                       48.36

α = ln a. OLS: ordinary least squares estimation; ML: maximum likelihood estimation. All coefficients are significant at the 1% level. JB: the Jarque–Bera non-normality test on the residuals; KB: the Koenker–Bassett test for heteroskedasticity; KR: the Kelejian–Robinson test for spatial autocorrelation in the error term; LIK: the log-likelihood ratio test; AIC: the Akaike Information Criterion. All statistics lead to the rejection of the corresponding null hypothesis at 1%


Lagrange Multiplier (LM) tests are less reliable. The Kelejian–Robinson statistics, though highly significant, cannot discriminate between a spatial lag and a spatial error formulation. In addition, we can also test for the existence of temporal correlation between the 11 equations, of the form:

$$E[\varepsilon_t \varepsilon_s'] = \sigma_{ts} I_N, \quad s, t = 1, \ldots, 11 \tag{3}$$

where $\varepsilon_t$ is an (N, 1) vector containing the N error terms for time period t and N = 722. This assumption of dependence between equations can be tested for by means of an LM test or a likelihood ratio (LR) test of the diagonality of the error covariance matrix. Note that this specification differs from the most familiar SUR design (Zellner 1962) with N fixed and T → ∞, where the regression coefficients are assumed to vary by cross-sections (but are constant over time) and where the error terms are contemporaneously correlated. When the cross-sectional units pertain to spatial units, this latter assumption allows estimating nonparametrically cross-sectional dependence, interpreted as spatial autocorrelation, which is left unspecified as a general covariance (see Hordijk and Nijkamp 1977; White and Hewings 1982 for applications). In our case, N > T, so that the standard SUR is not appropriate and spatial autocorrelation should instead be expressed as a parameterized function. The SUR model can be estimated using FGLS or maximum likelihood (ML). The latter corresponds to iterated FGLS, yielding consistent and asymptotically normal estimates under the assumption of normality of errors. Spatial autocorrelation can be incorporated either in the form of a spatial lag or in the form of a spatial error term. In the first case, the model can be written as follows:

$$\ln R_{it} = \rho_t \cdot \sum_{j=1}^{N} w_{ij} \ln R_{jt} + \ln a_t - b_t \cdot \ln S_{it} + \varepsilon_{it} \tag{4}$$

where the error terms are as in (3) and where $w_{ij}$ is an element of a spatial weights matrix W. It is equal to 1 if urban municipality i is, at most, 160 km away from urban municipality j, and 0 otherwise. The role of the spatial weights matrix is to introduce the notion of a neighborhood set for each of the urban municipalities.⁴ This model can also be estimated with ML. Note that for the spatial lag spatial SUR model, the three stage least squares estimation method has also been suggested when the assumption of normality is untenable and/or to avoid the computational problems associated with the Jacobian term in the ML estimation. However, in this case, appropriate instruments must be found and the estimation can yield explosive estimates of the spatial parameter, whereas it remains bounded with ML (see Anselin 1988; Anselin et al. 2007 for further technical details and Fingleton 2001 for an empirical application). Conversely, the spatial error SUR model is as follows:

$$\begin{cases} \ln R_{it} = \ln a_t - b_t \cdot \ln S_{it} + \varepsilon_{it} \\ \varepsilon_{it} = \lambda_t \cdot \sum_{j=1}^{N} w_{ij} \varepsilon_{jt} + u_{it} \end{cases} \tag{5}$$

4 We have also used a contiguity spatial weights matrix using a Thiessen polygonalization of the Spanish territory for the 722 urban municipalities. The results are similar and can be obtained upon request from the authors.

where the $u_{it}$ have the covariance structure given in (3). Again, this model can be estimated with ML. Two LM tests, LMERR for spatial error and LMLAG for spatial lag, can be computed on the residuals of the spatial SUR model (Eqs. (2), (3)) in order to discriminate between a spatial lag and a spatial error specification (Anselin 1988). Moreover, the temporal stability of the coefficients ($\alpha_t = \ln a_t$ and/or $b_t$) and of the spatial coefficients ($\rho_t$ or $\lambda_t$) can be tested for in Eqs. (2), (4) and (5). However, due to the presence of spatial autocorrelation, the Wald tests used in this case must be adjusted. The LM and LR diagonality tests of the error covariance matrix, as well as the Wald tests on the homogeneity of the parameters across equations, point to the superiority of a SUR specification over 11 individual equations (see Table 1). Moreover, both LM tests on spatial dependence reject the null of no spatial autocorrelation. The higher value of the LM test for spatial-error dependence is an indication that the spatial SUR spatial-error model (5) is more appropriate than a spatial lag one (4). Moreover, it shows a better performance in terms of goodness of fit (higher LIK and lower AIC). Figure 2 displays the evolution through time of the three estimations of the Pareto exponent. Though the OLS estimates of this parameter are always higher, they follow a similar evolution. From Fig. 2, it is clear that, in general terms, the estimated b parameter displays a decreasing trend until 1981, after which it starts to increase. As a result, we can deduce two different patterns over the course of the twentieth century: from 1900 to 1981, the size distribution of the set of 722 urban municipalities is increasingly divergent, while from the 1980s to the end of the period this distribution becomes progressively more even.
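To make the distance-band weights and the logic of an LM test for spatial error dependence concrete, here is a minimal sketch. It implements the standard single-equation statistic, $(e'We/s^2)^2 / \mathrm{tr}(W'W + W^2)$ with $s^2 = e'e/N$, which is only an analogue of the SUR versions reported in Table 1. The planar coordinates in kilometres, the function names and the toy residuals are all assumptions made for illustration.

```python
import math

def distance_band_weights(coords_km, cutoff_km=160.0):
    """Binary weights: w_ij = 1 if i != j and the distance is at most the cutoff."""
    n = len(coords_km)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords_km[i], coords_km[j]) <= cutoff_km:
                w[i][j] = 1.0
    return w

def spatial_lag(w, values):
    """Compute W @ values, i.e. the sum of neighbouring values for each unit."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, values)) for row in w]

def lm_error(residuals, w):
    """Single-equation LM statistic for spatial error dependence (chi-square, 1 df)."""
    n = len(residuals)
    s2 = sum(e * e for e in residuals) / n
    e_we = sum(e * le for e, le in zip(residuals, spatial_lag(w, residuals)))
    # tr(W'W + W^2) = sum_ij (w_ij^2 + w_ij * w_ji)
    trace = sum(w[i][j] ** 2 + w[i][j] * w[j][i]
                for i in range(n) for j in range(n))
    return (e_we / s2) ** 2 / trace

# Hypothetical municipalities on a line (coordinates in km) and toy OLS residuals.
coords = [(0.0, 0.0), (100.0, 0.0), (300.0, 0.0), (350.0, 0.0)]
W = distance_band_weights(coords)
stat = lm_error([1.0, 1.0, -1.0, -1.0], W)
```

In practice the weights would be built from great-circle distances between municipality centroids, and W is often row-standardized before estimation; whether the authors did so is not stated here.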
Looking in more depth, we can also distinguish two sub-periods within the first phase: from 1900 to 1930 (smaller divergence) and 1930–1981 (steeper divergence). This result is more or less consistent with Lanaspa et al. (2003), who found an inflexion date in the 1970s for the group of the 100 largest Spanish cities. In our case, we find evidence of inflexion in the 1980s, instead of the 1970s, due to the composition of the sample. Indeed, when considering the whole set of urban units (not only a group of metropolises), we can capture suburbanization or deconcentration of individual processes, which were common in industrialized countries during the last decades of the twentieth century (Stanback 1991).⁵ In the case of Spain, though larger cities started to lose some population during the 1970s, we find evidence for a general phenomenon of counterurbanization during the 1980s, when the declining process of traditional monocentric

5 Fielding (1989) demonstrated that the change from urbanization to counter-urbanization occurred in Spain during the 1980s, some years later than in most countries in Western Europe. Monclús (1997) and Esteve and Devolder (2004) also reached the same conclusion when analyzing urban growth for different sets of municipalities in Catalonia: the inflexion towards convergence in urban areas took place mainly during the 1980s as a consequence of a broad range of political and socio-economic changes.

Fig. 2 Evolution of the estimations of the Pareto exponent (N = 722). Vertical axis: b parameter; horizontal axis: census years 1900–2001. OLS: OLS estimation; SSURLAG: spatial SUR lag model; SSURERR: spatial SUR spatial-error model

cities clearly benefited the peripheral cities, leading to the modern multicentric city. This fact can be interpreted as a consequence of a broad range of historical, sociopolitical and economic changes: end of the Franco dictatorship, generalized increase of family income, return of emigrants, crisis in housing supply inside the core cities, demand for more space for housing and industries, improvement in accessibility and motorization, etc. Consequently, the analysis of Zipf’s law leads to an interesting result, i.e., the existence of two main phases in the evolution of Spanish urban municipalities. The main one, which extends over 80 years, consists of an increase in urban concentration that was broken only in the 1980s. Inside this first stage, we can distinguish two sub-periods, in which the divergence course between urban municipalities has different speeds:
• From 1900 to the 1930s, the b parameter displays a slower decreasing trend coinciding with a significant industrialization and urbanization expansion that led to progress and social changes in Spain. In the first decade, though most of the active population was located in the countryside, the labor force began to migrate to the main industrial cities, e.g. Barcelona and Bilbao, as well as to Madrid and Valencia. Neutrality during World War I and capital stock growth (provided by American and international investments) helped the development of some industrial activities (only located in certain cities) that demanded more workers (Tuñón de Lara et al. 1982). Moreover, during the 1920s, industrialization and urbanization went on growing, especially along the Madrid-North-Barcelona axis, leading to an incipient development of other satellite towns along the Cantabrian Coast (Bilbao Estuary area, Santander, Asturian cities) and the Mediterranean Coast (Valencia and Alicante). However, during the mid 1930s, the economic crisis and the Civil War stopped the urbanization process (Tuñón de Lara and Malerbe 1982).
• From 1940 to the 1970s, the b parameter experiences a quicker decline or, in other words, during this period the largest cities grew at significantly greater rates than the smallest population nuclei, exhibiting an intense divergent growth pattern. Indeed, during the 1940s, Spain lived under an autarkical regime that led to a real ruralization process: the main cities, destroyed after the Civil War, had to be re-built, hunger and poverty expelled a lot of people to the villages and, in general, urban population and active population decreased significantly.


Nevertheless, some big cities grew a lot, such as Madrid (due to the huge centralization and bureaucratization of the regime), Barcelona and other capitals (Valencia, Saragossa, Alicante and Seville). The incipient political and economic openness during the 1950s stopped the ruralization drive and set the basis for the decisive industrialization and tertiarization process experienced during the 1960s and 1970s (Tuñón de Lara and Viñas 1982). The industrial sector was severely constrained to make it more competitive and many workers had to migrate to Europe or to the Spanish capitals and new economic centers. Development was geographically irregular and affected only the cities located in richer provinces: Guipúzcoa, Biscay, Barcelona, Navarre, Madrid and Álava. Nevertheless, the Development Plans also created new economic poles, such as Vigo, Pontevedra, Coruña and Ferrol (in Galicia), Valladolid and Burgos (in Castile), Huelva, Cádiz, Seville (in Andalusia), Saragossa (in Aragón) and Badajoz (in Extremadura). While in 1960 only 30% of the Spanish population lived in cities above 100,000 inhabitants, by 1975 the urban population had risen to 50%: Spain was no longer rural and had become an industrial and urban country (Fusi et al. 1983).

During the last two decades of the twentieth century, the Zipf parameters change from the 80-year decreasing tendency to a noteworthy increasing one. In other words, the group of 722 urban municipalities displayed a clear convergence growth pattern as the smallest towns grew faster than the largest cities. Actually, Spain went through a strong counter-urbanization process that is not finished yet. By the beginning of the 1980s, there was a peculiar urban structure similar to a star, with its centre in Madrid.
On the axes, there were the vast Mediterranean metropolitan areas (Girona-Barcelona-Tarragona, Castellón-Valencia-Alicante-Murcia), Andalusia (Seville and Cádiz), Galicia (A Coruña-Ferrol, Vigo) and the Cantabrian Coast (Bilbao-San Sebastián, Santander, Gijón-Oviedo). In addition, inside this big star, there was a vast rural desert, only broken by a few urban oases, like Valladolid, Saragossa, Badajoz, Burgos, Vitoria and Pamplona. In the Islands, there was a similar process due to the huge growth of Palma (the Balearics), Las Palmas and Santa Cruz de Tenerife (the Canary Islands). The cities of Madrid and Barcelona grew towards their respective peripheries as did Valencia and Bilbao, although to a lesser extent. Indeed, the whole Basque Country was declared an “urban area”, as well as the Oviedo-Gijón-Avilés triangle (in Asturias) and the cities along the Mediterranean coast from Tarragona (in Catalonia) to Cartagena (in Murcia). The logical problems of the big cities (with uncontrolled growth in the peripheries and an incipient depopulation process of their historical centers) halted their further expansion in favor of medium-sized and even small cities and certain rural areas. Moreover, this de-urbanization of the largest cities was accompanied by some growth in their neighboring towns: suburban settlements gained many inhabitants and city centers were depopulated, restored and converted into CBDs and/or historical/cultural cores.

3 Mobility within the Spanish urban system 1900–2001

The density functions and Zipf’s law allow the characterization of the evolution of the global distribution, but they do not provide any information about the movements of


the urban municipalities within this distribution. For example, they do not say whether the right tail of the initial distribution (year 1900) contains the same municipalities as the right tail in the final distribution (year 2001). A possible way to answer these questions is to track the evolution of each urban municipality’s relative size over time by estimating transition probability matrices associated with discrete Markov chains (Kemeny and Snell 1976). This line of analysis has been pursued by Eaton and Eckstein (1997) for Japanese and French urban areas and by Black and Henderson (1999, 2003) for the US urban system.

3.1 Markov chains

The analysis of the evolution in time of an entire cross-section distribution, or distribution dynamics analysis, is a method aimed at describing the law of motion of the distribution as a Markovian stochastic process. In that respect, working in a discrete state-space has several advantages, as argued by Bulli (2001). Indeed, compared to continuous stochastic kernels,⁶ discrete probability distributions and transition matrices are easier to interpret: various descriptive indices and the long-run or ergodic distribution are easier to compute. On the other hand, this methodology raises the problem of arbitrary discretization. We return to this problem when presenting the empirical results. Formally, denote by $F_t$ the cross-sectional distribution of municipal size (population) at time t relative to the Spanish average. Define a set of K different size classes, which provide a discrete approximation of the population distribution. We first assume that the frequency of the distribution follows a first-order stationary Markov process.
In this case, the evolution of the municipal size distribution is represented by a transition probability matrix, M, in which each element (i, j) indicates the probability that a municipality that was in class i at time t ends up in class j in the following period.⁷ Formally, the (K, 1) vector $F_t$, indicating the frequency of the urban municipalities in each class at time t, is described by the following equation:

$$F_{t+1} = M F_t \tag{6}$$

where M is the (K, K) transition probability matrix representing the transition between the two distributions as follows:

$$M = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1K} \\ p_{21} & p_{22} & \cdots & p_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ p_{K1} & p_{K2} & \cdots & p_{KK} \end{bmatrix} \tag{7}$$

6 For examples of studies using continuous stochastic kernels, see, among others, Quah (1997), Johnson (2001), Fingleton and Lopez-Bazo (2003) and the references therein.
7 The so-called Markov property implies that the future of a process depends only on its present class and not on its history.


where each element $p_{ij} \geq 0$ and $\sum_{j=1}^{K} p_{ij} = 1$. The stationary transition probabilities $p_{ij}$ capture the probability that a municipality in class i in t − 1 ends up in class j in t. The elements of M can be estimated from the observed frequencies in the changes of class from one period to another. Thus, following Amemiya (1985) or Hamilton (1994), the maximum likelihood estimator of $p_{ij}$ is:

$$\hat{p}_{ij} = \frac{n_{ij}}{n_i} \tag{8}$$

where $n_{ij}$ is the total number of urban municipalities moving from class i in decade t − 1 to class j in the immediately following decade t over all ten transitions, and $n_i$ is the total number of municipalities ever in class i over the ten transitions. If the transition probabilities are stationary, that is, if the probabilities between two classes are time-invariant, then:

$$F_{t+s} = M^s F_t \tag{9}$$
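The estimator in Eq. (8) amounts to counting transitions and dividing by row totals. The sketch below assumes each municipality is stored as a time-ordered list of 0-based class indices; this data layout, the function names and the toy paths are hypothetical.

```python
def transition_counts(class_paths, n_classes):
    """Count moves n_ij from class i to class j between consecutive periods."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for path in class_paths:
        for origin, destination in zip(path[:-1], path[1:]):
            counts[origin][destination] += 1
    return counts

def transition_matrix(counts):
    """ML estimate p_ij = n_ij / n_i; rows with no observations are left at zero."""
    matrix = []
    for row in counts:
        n_i = sum(row)
        matrix.append([n_ij / n_i if n_i else 0.0 for n_ij in row])
    return matrix

# Three hypothetical municipalities observed over three periods, three classes.
paths = [[0, 0, 1], [1, 1, 1], [2, 1, 0]]
counts = transition_counts(paths, n_classes=3)
M_hat = transition_matrix(counts)
```

Pooling all period-to-period moves in one count matrix is what the stationarity assumption permits; estimating a separate matrix per decade would be the natural way to probe that assumption.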

In this framework, one can determine the ergodic distribution (also called the long-term, long-run, equilibrium or steady state distribution) of $F_t$, characterized when s tends toward infinity in Eq. (9), that is to say, once the changes represented by matrix M are repeated an arbitrary number of times. Such a distribution exists if the Markov chain is regular, that is, if and only if, for some m, $M^m$ has no zero entries. In this case, the transition probability matrix converges to a limiting matrix $M^*$ of rank 1. The existence of an ergodic distribution, $F^*$, is then characterized by:

$$F^* M = F^* \tag{10}$$
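One simple way to obtain a vector satisfying Eq. (10) is to apply the transition matrix repeatedly to a starting distribution until it stops changing. The sketch below treats the distribution as a row vector, as in Eq. (10), and uses power iteration rather than an explicit eigen-decomposition; the function name and the two-state matrix are hypothetical.

```python
def ergodic_distribution(matrix, tol=1e-12, max_iter=100_000):
    """Iterate F <- F M from a uniform start until successive vectors agree."""
    k = len(matrix)
    dist = [1.0 / k] * k
    for _ in range(max_iter):
        new = [sum(dist[i] * matrix[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    raise RuntimeError("no convergence; the chain may not be regular")

# Hypothetical regular two-state chain (rows sum to one).
toy_matrix = [[0.9, 0.1],
              [0.5, 0.5]]
f_star = ergodic_distribution(toy_matrix)
```

For this toy chain the balance condition 0.1·F₁ = 0.5·F₂ gives the limit (5/6, 1/6). With estimated matrices whose rows are rounded, it may be prudent to renormalize each row to sum to one before iterating.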

This vector $F^*$ describes the future distribution of the urban municipalities if the movements observed in the sample period are repeated to infinity. Each row of $M^t$ tends to the limit distribution as t → ∞. According to Eq. (10), this limit distribution is therefore given by the eigenvector associated with the unit eigenvalue of M.

The assumption of a first-order stationary Markov process requires the transition probabilities, $p_{ij}$, to be of order 1, that is, to be independent of classes at the beginning of previous periods (at time t − 2, t − 3, …). If the chain is of a higher order, the first-order transition matrix will be misspecified. Indeed, it will contain only part of the information necessary to describe the true evolution of population distribution. Moreover, the Markov property implicitly assumes that the transition probabilities, $p_{ij}$, depend on i (i.e., that the process is not of order 0). In order to test this property, Bickenbach and Bode (2003) emphasize the role of the test of time independence. In determining the order of a Markov chain, Tan and Yilmaz (2002) suggest, firstly, to test order 0 versus order 1; secondly, to test order 1 versus order 2; and so on. If the test of order 0 against order 1 is rejected, and the test of order 1 against order 2 is not rejected, the process may be assumed to be of order 1. To test for order 0, the null hypothesis $H_0: \forall i: p_{ij} = p_j$ (i = 1, …, K) is tested against the alternative $H_a: \exists i: p_{ij} \neq p_j$. The appropriate likelihood ratio (LR) test statistic reads as follows:

$$LR^{(O(0))} = 2 \sum_{i=1}^{K} \sum_{j \in A_i} n_{ij} \ln \frac{\hat{p}_{ij}}{\hat{p}_j} \sim \text{asy } \chi^2\left((K-1)^2\right) \tag{11}$$

assuming that $\hat{p}_j > 0, \forall j$ (j = 1, …, K). $A_i = \{j : \hat{p}_{ij} > 0\}$ is the set of nonzero transition probabilities under $H_a$. To test for order 1 versus 2, a second-order Markov chain is defined by also taking into consideration the population size classes k (k = 1, …, K) in which the municipalities were at time t − 2 and assuming that the pair of successive classes k and i forms a composite class. Then, the probability of an urban municipality moving to class j at time t, given it was in k at t − 2 and in i at t − 1, is $p_{kij}$. The corresponding absolute number of transitions is $n_{kij}(t)$, with the marginal frequency being $n_{ki}(t-1) = \sum_j n_{kij}(t)$. To test $H_0: \forall k: p_{kij} = p_{ij}$ (k = 1, …, K) against $H_a: \exists k: p_{kij} \neq p_{ij}$, the $p_{kij}$ are estimated as $\hat{p}_{kij} = n_{kij}/n_{ki}$, where $n_{kij} = \sum_{t=2}^{T} n_{kij}(t)$ and $n_{ki} = \sum_{t=2}^{T} n_{ki}(t-1)$. The $p_{ij}$ are estimated from the entire data set as $\hat{p}_{ij} = n_{ij}/n_i$. The appropriate LR test statistic reads as follows:

$$LR^{(O(1))} = 2 \sum_{k=1}^{K} \sum_{i=1}^{K} \sum_{j \in C_{ki}} n_{kij} \ln \frac{\hat{p}_{kij}}{\hat{p}_{ij}} \sim \text{asy } \chi^2\left(\sum_{i=1}^{K} (c_i - 1)(d_i - 1)\right) \tag{12}$$

Similar to the notation above, $C_i = \{j : \hat{p}_{ij} > 0\}$, $c_i = \#C_i$, $C_{ki} = \{j : \hat{p}_{kij} > 0\}$ and $d_i = \#D_i$ with $D_i = \{k : n_{ki} > 0\}$. If both Markovity of order 0 and of order 1 are rejected, the tests can be extended to higher orders by introducing additional dimensions for population size at time t − 3, t − 4, and so on. However, since the number of parameters to be estimated increases exponentially with the number of time lags, while the number of available observations decreases linearly for a given data set, the reliability of estimates and the power of the test decrease rapidly. Therefore, Tan and Yilmaz (2002) suggest setting an a priori limit up to which the order of the Markov chain can be tested.

3.2 Empirical results

In order to carry out the methodology described above, a discretization of the continuous state-space must be chosen. However, as pointed out by Magrini (1999), Bulli (2001) or Cheshire and Magrini (2000), an improper discretization may have the undesired effect of removing the Markov property and therefore may lead to very misleading results, especially when the computation of ergodic distributions is based on the estimates of the discrete transition probabilities. Some authors (Quah 1993; Lopez-Bazo et al. 1999; Kawagoe 1999 or Le Gallo 2004) choose to discretize the distribution in such a way that the initial classes include a similar number of individuals. Conversely, Magrini (1999) or Cheshire and Magrini (2000) base their choice between possible


Table 2 Probability transition matrix, 1900–2001: Spain-relative population size

Class    1 <20%   2 <50%   3 <80%   4 <135%   5 <185%   6 >185%   Number of observations
1        0.944    0.053    0.002    0         0         0         2567
2        0.040    0.879    0.074    0.005     0.002     0.001     1751
3        0        0.162    0.752    0.078     0.005     0.003     1029
4        0        0.001    0.184    0.741     0.066     0.008     852
5        0        0        0        0.273     0.632     0.095     315
6        0        0        0        0.001     0.061     0.938     706

classes in terms of the ability of the discrete distribution to approximate the observed continuous distribution. In this paper, we have tried numerous ways of discretizing the distribution, with different numbers of classes (5, 6, 7). In the end, the discretization was chosen by considering the best performance of the test for order one, though we have tried to set up balanced classes even if it comes at some cost to this test. We distinguish between six different classes: (1) population less than 20% of the Spanish average, (2) population between 20 and 50% of the Spanish average, (3) population between 50 and 80% of the Spanish average, (4) population between 80 and 135% of the Spanish average, (5) population between 135 and 185% of the Spanish average, and (6) population more than 185% of the Spanish average.

Table 2 contains the first-order transition probability matrix between 1900 and 2001 with the ML estimates $\hat{p}_{ij}$ of the transition probabilities for population. For example, during the century, there were 2,567 instances of an urban municipality having a population size lower than 20% of the Spanish average. The majority of these municipalities (94.4%) remained in that size class at the end of the decade, while 5.3% moved up one class by the end of the decade. Note also that the transition probability matrix is regular.

For the process in Table 2, Markovity of order 0 is tested by comparing each row of the transition matrix to the population distribution at time $t$ using the test statistic (11). The result (LR = 16602.90; prob = 0; df = 25) leaves no doubt that the process strongly depends on the initial condition at time $t-1$, i.e., that the chain is at least of order 1. To test Markovity of order 1, six subsamples $k = 1, \ldots, 6$ are defined, representing the urban municipalities’ size at time $t-2$. Observations for municipalities that were in the first size class at time $t-2$ are allocated to the first subsample ($k = 1$) and so on.
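The discretization into six classes and the ML estimates $\hat{p}_{ij} = n_{ij}/n_i$ reported in Table 2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the input is assumed to be an array of populations already divided by the Spanish average, and the treatment of values falling exactly on a class boundary (assigned to the upper class here) is an assumption, since the text does not specify it.

```python
import numpy as np

# Class boundaries from the text, as shares of the Spanish average.
BOUNDS = np.array([0.20, 0.50, 0.80, 1.35, 1.85])
K = len(BOUNDS) + 1  # six classes

def size_class(relative_pop):
    """Map relative population (1.0 = Spanish average) to classes 1..6."""
    return np.digitize(relative_pop, BOUNDS) + 1

def transition_matrix(classes):
    """Pooled transition counts and ML probabilities.

    classes is assumed to be an integer array of shape
    (municipalities, censuses) with values in 1..K.
    """
    counts = np.zeros((K, K))
    for row in classes:
        for start, end in zip(row[:-1], row[1:]):
            counts[start - 1, end - 1] += 1
    n_i = counts.sum(axis=1, keepdims=True)
    # Rows without observations are left at zero instead of dividing by zero.
    probs = np.divide(counts, n_i, out=np.zeros_like(counts), where=n_i > 0)
    return counts, probs

# Invented toy panel: two municipalities observed at three censuses.
toy = np.array([[1, 1, 2],
                [1, 1, 1]])
toy_counts, toy_probs = transition_matrix(toy)
```

In the toy panel, three of the four transitions that start in class 1 stay there, so the estimated probability of remaining in class 1 is 0.75.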
For each of these subsamples, a separate matrix is estimated for observed transitions from time $t-1$ to $t$ in the usual way. The general test comparing the matrices for all five subsamples to the matrix for the entire sample simultaneously, similar to Eq. (12) above, results in LR = 198.12. This statistic is significant with 63 degrees of freedom (prob = 0), indicating that the process under consideration is of a higher order, at least of order 2, if Markovian at all. However, there are a number of classes within subsamples for which we cannot expect reliable estimates of transition probabilities because only very few observations are available. In addition, Fingleton (1983a,b, 1986, 1999) has argued that tests for Markov chains are inflated in the presence of


spatial autocorrelation, which has been shown to exist in our case. Therefore, we decided to keep the assumption of order 1 for the Markov chain.

We can make several comments about the matrix in Table 2, related to interclass movements, mobility speed, convergence pattern and influence of space. First, the high probabilities on the diagonal show a low interclass mobility, i.e., a high persistence of urban municipalities in their own class from one decade to another over the whole period. Diagonal elements of the transition matrix approaching 1 have been interpreted as parallel growth by Eaton and Eckstein (1997). However, since these elements are not exactly 1, we can analyze the propensity of cities in each cell to move into other cells. In particular, it appears that the smallest and largest urban municipalities (classes 1 and 6, respectively) have higher persistence, while medium-sized cities (categories 3, 4 and 5) have a higher probability of moving to smaller categories. In addition, in classes 2 and 3, a small number of urban municipalities move up by more than two steps, even reaching the top (some towns within the Madrid, Barcelona and Bilbao metro areas), while they only move down one cell. Nevertheless, only in class 2 does the probability of moving up a class exceed that of moving down. This low interclass mobility of urban municipalities is in line with the results found for other cases such as US MSAs (Black and Henderson 2003) or the largest Spanish cities (Lanaspa et al. 2003): changes in city size are not drastic over a decade. However, the higher persistence of both the largest and smallest urban municipalities in their initial state highlights the major role of medium-sized towns (10,000 to 70,000 inhabitants) in the processes of urban agglomeration and suburbanization that occurred in Spain during the twentieth century.
In other words, on the one hand, a group of cities progressively moved to lower states in the distribution: cities located in the Northwest (Asturias and Galicia), some inland Castilian provinces (mainly Ciudad Real) and in the South of Spain (Andalusia, Alicante and Murcia). On the other hand, another group of medium-sized towns was thriving in the Madrid, Barcelona, Bilbao and Seville metro areas, as well as in some tourist enclaves in the Balearics, Canary Islands, Comunidad Valenciana, Almería and Málaga. These results demonstrate the existence of some spatial regularity in the main urban changes.

Second, in order to determine the speed with which the urban municipalities move within the distribution, we consider the matrix of mean first passage time $M_P$, where the element $M_{P,ij}$ indicates the expected time for an urban municipality to move from class $i$ to class $j$ for the first time. For a regular Markov chain, $M_P$ is defined as (Kemeny and Snell 1976, Chap. 4):

$$M_P = (I_K - Z + e e' Z_{dg}) D \qquad (13)$$

where $I_K$ is the identity matrix of order $K$, $Z$ is the fundamental matrix, $Z = (I_K - M + M^*)^{-1}$, $M^*$ is the limiting matrix, $e$ is the unit vector (a column vector of ones), $Z_{dg}$ results from $Z$ by setting off-diagonal entries to 0, and $D$ is the diagonal matrix with diagonal elements $1/m^*_j$. Table 3 displays the mean first passage time matrix for population. The mean number of years to reach any class is relatively high: the shortest passage time is 91.9 years and the longest is 3110.7 years. Globally, movements up are slower than movements


Table 3 Mean first passage time matrix in decades, 1900–2001: Spain-relative population

Class    1 <20%   2 <50%   3 <80%   4 <135%   5 <185%   6 >185%
1        3.93     18.67    39.00    78.88     158.83    311.07
2        51.88    2.82     21.99    62.01     142.50    294.48
3        63.87    11.99    5.53     45.12     127.61    279.61
4        73.55    21.67    9.90     10.20     95.73     251.82
5        82.74    30.86    19.08    9.19      28.87     189.41
6        98.57    46.69    34.92    25.02     18.22     12.89

down, especially for high-size classes: for example, the expected time to first move from class 5 to class 6 is 1894.1 years. Remember that these calculations account for the fact that, starting from class 5, a site might visit classes 4, 3, 2 or 1 before going to class 6. From class 1 it takes 3110.7 years to first visit class 6, with the outstanding upward mobility of some metropolitan cities in the Madrid metro area (Alcobendas, Alcorcón, Coslada, Móstoles, Parla and Torrejón de Ardoz) and the Barcelona metro area (Santa Coloma de Gramenet) as examples of actual moves from state 1 to state 6. This result of faster declines shows that urban municipalities are more likely to lose population than to gain it, especially inland, in big capitals and old industrial centers (e.g. several cities in Andalusia, Asturias, Comunidad Valenciana and Murcia).8 This finding points to a general progressive suburbanization process, as in other modern post-industrialized countries (Brakman et al. 1999), which has put an end to the era of big-city growth, favoring the progressive appearance of smaller population nuclei enjoying lower levels of congestion. This conclusion is also compatible with the 80-year phase of divergence in size between urban municipalities, only reversed during the last two decades, as indicated by Zipf’s parameter in Fig. 2.

Third, we consider the ergodic distribution, which can be interpreted as the long-run equilibrium urban municipality-size distribution in the urban areas system. Explicitly, given a regular transition matrix, after many periods the distribution of urban municipalities will no longer change: this is the ergodic or limit distribution. It is used to assess the form of convergence in a distribution.
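Both the ergodic distribution and the mean first passage times of Eq. (13) can be obtained directly from a transition matrix. The sketch below uses the rounded probabilities printed in Table 2, renormalized so that each row sums to one; that renormalization and the function names are assumptions made for illustration. Because the inputs are rounded, the results should only approximate the values published in Tables 3 and 4.

```python
import numpy as np

# Rounded transition probabilities copied from Table 2.
P = np.array([
    [0.944, 0.053, 0.002, 0.000, 0.000, 0.000],
    [0.040, 0.879, 0.074, 0.005, 0.002, 0.001],
    [0.000, 0.162, 0.752, 0.078, 0.005, 0.003],
    [0.000, 0.001, 0.184, 0.741, 0.066, 0.008],
    [0.000, 0.000, 0.000, 0.273, 0.632, 0.095],
    [0.000, 0.000, 0.000, 0.001, 0.061, 0.938],
])
P = P / P.sum(axis=1, keepdims=True)  # assumption: undo rounding so rows sum to 1

def ergodic_distribution(P):
    """Solve pi P = pi together with sum(pi) = 1 for a regular chain."""
    K = P.shape[0]
    A = np.vstack([P.T - np.eye(K), np.ones(K)])
    b = np.zeros(K + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def mean_first_passage(P):
    """Mean first passage times, M_P = (I - Z + e e' Z_dg) D, as in Eq. (13)."""
    K = P.shape[0]
    pi = ergodic_distribution(P)
    M_star = np.tile(pi, (K, 1))                 # limiting matrix: every row is pi
    Z = np.linalg.inv(np.eye(K) - P + M_star)    # fundamental matrix
    E = np.ones((K, K))                          # e e'
    Z_dg = np.diag(np.diag(Z))
    D = np.diag(1.0 / pi)
    return (np.eye(K) - Z + E @ Z_dg) @ D

pi = ergodic_distribution(P)
M_P = mean_first_passage(P)   # in decades, since the transitions are decadal
print(np.round(pi, 3))
print(np.round(M_P, 2))
```

The diagonal of `M_P` holds the mean recurrence times $1/m^*_j$, and multiplying any entry by ten converts decades into years, which is how the figures quoted in the text relate to Table 3.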
Concentration of the frequencies in a certain class would imply convergence (if it is the middle class, it would be convergence to the mean), while concentration of the frequencies in some of the classes, that is, a multimodal limit distribution, may be interpreted as a tendency towards stratification into different convergence clubs. Finally, a dispersion of this distribution amongst all classes is interpreted as divergence. Ergodic distributions are computed for population size in Table 4. It appears that the ergodic distribution is more concentrated in the small-size municipalities (1st and 2nd classes), a result that reveals the existence of convergence towards smaller-size populations. In addition, we find stability of the ergodic distribution compared to the initial one, though there is slightly more probability in category 2. This outcome points to a very slight downward convergence, a result compatible with the kernel density

8 Again, this result contrasts with the behavior of US metro areas (Black and Henderson 2003, p. 358).


Table 4 Initial versus ergodic distributions 1900–2001: Spain-relative population size

                        1 <20%   2 <50%   3 <80%   4 <135%   5 <185%   6 >185%
Initial distribution    0.356    0.243    0.143    0.118     0.044     0.098
Ergodic distribution    0.254    0.355    0.181    0.098     0.035     0.078

function (Fig. 1) and transition matrix (Table 2) results. As we have mentioned earlier, the choice of a discretization may have a heavy impact on the determination of the ergodic distribution. Consequently, we have also computed other ergodic distributions with different discretization methods and different numbers of classes. For example, with 6 classes and the same number of individuals in each class, there is an increase in probability in the ergodic distribution (compared to the initial distribution) for the classes containing the municipalities with a population less than 62% of the Spanish average. When the discretization is chosen so that it minimizes the value of the chi-square test of order 1 versus order 0, there is again an increase in probability for the classes containing the municipalities with a population less than 74% of the Spanish average. In fact, in every configuration we have tried, we observe an increase in probability in the ergodic distribution for the classes containing the municipalities with a population less than 60 to 75% of the Spanish average. We can therefore conclude that our result of slightly downward convergence is robust to the choice of discretization.

Fourth, we have analyzed the influence of space on the transition probabilities, as in Le Gallo (2004) and Rey (2001). The relationship between the direction of an urban municipality’s transition in the population distribution and the relative populations of its neighbors is considered more generally in Table 5. The probability of a particular transition (Down, Same, or Up) conditioned on the populations of the urban municipality’s neighbors at the beginning of the decade is reported. There is clear evidence that the probability of an upward or downward move differs depending on the urban area context.
For example, the probability of an urban municipality moving up in the hierarchy is 7.1% when its spatial lag contains on average less population, whereas it is 8% when it contains on average more population. Conversely, the probability of an urban municipality moving down in the hierarchy is 18.9% when its spatial lag contains on average less population, whereas it is only 3.9% (almost five times lower) when it contains on average more population. Specifically, the highest degree of spatial autocorrelation is of positive sign and occurs in nuclei with downward movements that are surrounded by smaller-size neighbors (18.9%). This situation takes place in declining areas with high levels of

Table 5 Transition probabilities conditioned on the spatial lag of population

Spatial lag        Move down   Same    Up
Less population    0.189       0.740   0.071
Same               0.057       0.898   0.045
More population    0.039       0.881   0.080


unemployment and emigration, which is the case of some inland capitals (Córdoba, Jaén) and heads of comarca9 (Guadix, Úbeda, Orihuela, Lorca), as well as old industrial settlements (Langreo, Mieres, Linares, Puertollano, Tortosa). Conversely, the lowest degree of spatial autocorrelation is of negative sign and can be found in regressive towns that are close to a big city absorbing their population (3.9%). This is the case, amongst others, of the urban municipalities of Huelva, which are attracted by the Seville metro area, or the cities of Teruel and Calatayud, which are close to Saragossa. It is also interesting to highlight that the process of counter-urbanization is related to positive spatial autocorrelation (8%), e.g. growing towns surrounded by a big metropolis. In these cases, the big city spills population out to the peripheral towns, which benefit from their advantageous location. Most towns in the metro areas of Madrid and Barcelona share this situation, as well as some tourist municipalities along the Mediterranean coast and the Islands. Finally, there is another case of agglomeration characterized by negative spatial autocorrelation, in which a thriving city grows while surrounded by smaller towns (7%). This condition is present in some capitals that are still in an urbanization process (Málaga) or, mainly, in prosperous middle-sized cities that attract population from dying rural villages. Therefore, the influence of space on the urban municipality transition probabilities seems more important for downward movements. The influence of neighbors is confirmed by the $\chi^2$ test of independence between the direction of move and the neighbors’ population size: with 4 degrees of freedom, the statistic takes a value of 398.087, which is significant at prob = 0. In conclusion, the direction of movement in the population distribution of urban municipalities is not independent of the geographic environment.
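The $\chi^2$ test of independence mentioned above works on a 3 × 3 table of counts (spatial-lag category by direction of move), which gives (3 − 1)(3 − 1) = 4 degrees of freedom. The sketch below is a generic Pearson statistic, not the authors' computation. The counts are hypothetical: each row is Table 5's conditional probabilities scaled to an assumed 1,000 observations, because the true row totals are not reported in the text, so the result is not expected to reproduce the published value of 398.087.

```python
import numpy as np

def chi2_independence(counts):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    counts = np.asarray(counts, dtype=float)
    row_totals = counts.sum(axis=1, keepdims=True)
    col_totals = counts.sum(axis=0, keepdims=True)
    expected = row_totals @ col_totals / counts.sum()   # counts under independence
    stat = float(((counts - expected) ** 2 / expected).sum())
    dof = (counts.shape[0] - 1) * (counts.shape[1] - 1)
    return stat, dof

# HYPOTHETICAL counts: Table 5 rows times an assumed 1,000 observations each.
# Rows: less / same / more population in the spatial lag; columns: down / same / up.
hypothetical_counts = np.array([
    [189, 740, 71],
    [57, 898, 45],
    [39, 881, 80],
])
stat, dof = chi2_independence(hypothetical_counts)

# Sanity check: rows with identical proportions give a statistic of zero.
stat_zero, dof_zero = chi2_independence([[10, 20, 30], [20, 40, 60], [30, 60, 90]])
```

A large statistic relative to the $\chi^2$ distribution with 4 degrees of freedom indicates that the direction of move depends on the neighbors' relative size.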
4 Main conclusions

The urbanization process has mainly taken place during the twentieth century, producing significant processes of industrialization and economic growth. Specifically in Spain, this process has not been uniform and exhibits different shapes depending on the definition of “urban area”. In our case, we work with a set of 722 municipalities that make up the Spanish urban areas: main cities and their satellite towns. Zipf’s law shows the existence of two main phases in the evolution of these urban municipalities: 1900–1980 (divergence) and 1980–2001 (convergence). The first and longer phase extends over 80 years and consists of an increase in urban concentration, though two different sub-periods should be distinguished: 1900–1940 and 1940–1980. From 1900 to the 1930s, divergence is not that deep, coinciding with a significant expansion of industrialization and urbanization that led to progress and social changes. However, this dynamism was violently broken at the end of the decade by the Civil War. From 1940 to the 1970s, the largest cities grew much more quickly than the smallest population nuclei, leading to a more intense divergent pattern of growth.

9 The comarca is a historical, non-official division of Spanish provinces. This area is headed by a city that used to be a historic settlement and traditional center of trade, transportation, administration and cultural activities.


During the last decades of the twentieth century, Zipf’s parameters change from the 80-year decreasing tendency to a noteworthy increasing one. In other words, the group of 722 urban municipalities displayed a clear convergence growth pattern, since the smallest towns grew faster than the largest cities. The logical problems of the big cities (with uncontrolled growth in the peripheries and an incipient depopulation of their historical centers) halted their later expansion in favor of middle-sized and even small cities and certain rural areas.

The Markov chain analysis shows a low interclass mobility, i.e., a high persistence of urban municipalities in their own class from one decade to another over the whole period. However, the largest and smallest urban municipalities display higher persistence than the medium-sized cities, which have a higher probability of moving to smaller categories. This highlights the major role played by the medium-sized cities in the processes of urban agglomeration and suburbanization that occurred in Spain during the twentieth century. In general terms, movements up are slower than movements down, especially for high-size classes. This result of faster declines indicates that urban municipalities are more likely to lose population than to gain it, especially inland, in big capitals and old industrial centers. This conclusion is compatible with the 80-year phase of divergence in size between urban municipalities, only reversed during the last two decades. This is why population convergence is still slight and mainly “downwards” inside the group of urban municipalities. Finally, the probability of an urban municipality losing population (moving down in the hierarchy) is almost five times higher when it is surrounded by towns that contain, on average, less population than when it is surrounded by towns that contain more. This result confirms the influence of space on urban population dynamics, an influence that is more important for downward movements.
In summary, the Spanish urban municipalities have experienced an agglomeration process throughout practically the whole of the twentieth century, only broken by an incipient suburbanization in the last decades. The main actors of these changes have been the medium-sized towns: on the one hand, some peripheral cities around the metro areas of Madrid, Barcelona, Bilbao and Seville have experienced strong growth whereas on the other hand, a considerable number of industrial nuclei and inland heads of comarcas have lost their old influence and size. The influence of space on urban change is certainly conspicuous. It leads to the so-called “two Spains”, which are no longer split along the usual North vs South partition. In the case of urban growth, another spatial division is relevant: on the one hand, there is a group of declining towns located in the Northwest, Center and South of Spain, which have progressively moved to lower states in the city-size distribution. On the other hand, there is another group of “winners” formed by most of the metro area cities (Madrid, Barcelona, Bilbao and Seville) and some tourist enclaves in The Balearics, Canary Islands, Comunidad Valenciana, Almería and Málaga.

References

Amemiya T (1985) Advanced econometrics. Harvard University Press, Cambridge
Anderson G, Ge Y (2005) The size distribution of Chinese cities. Reg Sci Urban Econ 35:756–776
Anselin L (1988) Spatial econometrics: Methods and models. Kluwer, Dordrecht
Anselin L, Le Gallo J, Jayet H (2007) Spatial panel econometrics. In: Matyas L, Sevestre P (eds) The econometrics of panel data, 3rd edn. Kluwer, Dordrecht
Bickenbach F, Bode E (2003) Evaluating the Markov property in studies of economic convergence. Int Reg Sci Rev 26:363–392
Black D, Henderson V (1999) Spatial evolution of population and industry in the United States (AEA Papers and Proceedings). Am Econ Rev 89:321–327
Black D, Henderson V (2003) Urban evolution in the USA. J Econ Geogr 3:343–372
Brakman S, Garretsen H, Van Marrewijk C, Van den Berg M (1999) The return of Zipf: towards a further understanding of the rank-size distribution. J Reg Sci 39:183–213
Bulli S (2001) Distribution dynamics and cross-country convergence: a new approach. Scott J Polit Econ 48:226–243
Cheshire P, Magrini S (2000) Endogenous processes in European regional growth: convergence and policy. Growth Change 31:455–479
Cordoba JC (2003) On the distribution of city sizes. Mimeo, Economics Department, Rice University
Davis D, Weinstein D (2002) Bones, bombs, and break points: the geography of economic activity. Am Econ Rev 92:1269–89
Dobkins L, Ioannides YM (2000) Dynamic evolution of the US city size distribution. In: Huriot JM, Thisse JF (eds) Economics of cities. Cambridge University Press, Cambridge, pp 217–260
Duranton G (2006) Some foundations for Zipf’s law: product proliferation and local spillovers. Reg Sci Urban Econ 36:542–563
Eaton J, Eckstein Z (1997) City and growth: theory and evidence from France and Japan. Reg Sci Urban Econ 17:443–474
Esteve A, Devolder D (2004) De la ley rango-tamaño (rank-size) a la ley log-normal: los procesos aleatorios en el crecimiento demográfico de los agregados de población. VII Congreso de la Asociación de Demografía Histórica, Granada
Fielding AJ (1989) Migration and urbanization in Western Europe since 1950. Geogr J 155:60–69
Fingleton B (1983a) Independence, stationarity, categorical spatial data and the chi-squared test. Environ Plann A 15:483–499
Fingleton B (1983b) Log-linear models with dependent spatial data. Environ Plann A 15:801–814
Fingleton B (1986) Analyzing cross-classified data with inherent spatial dependence. Geogr Anal 18:48–61
Fingleton B (1999) Estimates of time to economic convergence: an analysis of regions of the European Union. Int Reg Sci Rev 22:5–34
Fingleton B (2001) Theoretical economic geography and spatial econometrics: dynamic perspectives. J Econ Geogr 1:201–225
Fingleton B, Lopez-Bazo E (2003) Explaining the distribution of manufacturing productivity in the EU regions. In: Fingleton B (ed) European regional growth. Springer, Heidelberg, pp 375–409
Fusi JP, Vilar S, Preston P (1983) De la dictadura a la democracia. Desarrollismo, crisis y transición. Historia 16, vol. XXV
Gabaix X (1999a) Zipf’s law and the growth of cities (AEA Papers and Proceedings). Am Econ Rev 89:129–132
Gabaix X (1999b) Zipf’s law for cities: an explanation. Quar J Econ 114:759–767
Gabaix X, Ibragimov R (2006) Rank −1/2: a simple way to improve the OLS estimation of tail exponents. Available for download at: http://econ-www.mit.edu/faculty/index.htm?prof_id=xgabaix&type=paper
Gabaix X, Ioannides YM (2004) The evolution of city size distributions. In: Henderson V, Thisse JF (eds) Handbook of regional and urban economics, vol 4. North Holland, Amsterdam, pp 2341–2378
Grossman G, Helpman E (1991) Innovation and growth in the world economy. MIT Press, Cambridge
Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton
Hordijk L, Nijkamp P (1977) Dynamic models of spatial autocorrelation. Environ Plann A 9:505–519
Ioannides YM, Overman HG (2003) Zipf’s law for cities: an empirical examination. Reg Sci Urban Econ 33:127–137
Johnson PA (2002) A nonparametric analysis of income convergence across the United States. Econ Lett 69:219–223
Kawagoe M (1999) Regional dynamics in Japan: a reexamination of Barro regressions. J Jpn Int Econ 13:61–72
Kemeny J, Snell L (1976) Finite Markov chains. Springer, New York
Krugman P (1996) The Self-organizing economy. Blackwell, Cambridge
Lanaspa L, Perdiguero AM, Sanz F (2004) La distribución del tamaño de las ciudades en España, 1900–1999. Revista de Economía Aplicada 34:5–16
Lanaspa L, Pueyo F, Sanz F (2003) The evolution of Spanish urban structure during the twentieth century. Urban Stud 40:567–580
Le Gallo J (2004) Space-time analysis of GDP disparities among European regions: a Markov Chains approach. Int Reg Sci Rev 27:138–163
Lopez-Bazo E, Vaya E, Mora AJ, Suriñach J (1999) Regional economic dynamics and convergence in the European Union. Ann Reg Sci 33:343–370
Magrini S (1999) The evolution of income disparities among the regions of the European Union. Reg Sci Urban Econ 29:257–281
Mella JM, Chasco C (2006) A spatial econometric analysis of urban growth and territorial dynamics: a case study on Spain. In: Nijkamp P, Reggiani A (eds) Spatial evolution and modeling. Edward Elgar, pp 319–360
Ministerio de Fomento (2000) Atlas Estadístico de las Áreas Urbanas en España, Subdirección General de Urbanismo, Madrid
Monclús FJ (1997) Planeamiento y crecimiento suburbano en Barcelona: de las extensiones periféricas a la dispersión metropolitana (1897–1997). Coloquio sobre El desarrollo urbano de Montréal y Barcelona en la época contemporánea: estudio comparativo, Universidad de Barcelona
Nitsch V (2005) Zipf zipped. J Urban Econ 57:86–100
Overman HG, Ioannides YM (2001) Cross-sectional evolution of the US city size distribution. J Urban Econ 49:543–566
Pareto V (1897) Cours d’Economie Politique. Rouge et Cie, Paris
Quah D (1993) Empirical cross-section dynamics in economic growth. Euro Econ Rev 37:426–434
Quah D (1996) Empirics for economic growth and convergence. Euro Econ Rev 40:1353–1375
Quah D (1997) Empirics for growth and distribution: stratification, polarization, and convergence clubs. J Econ Growth 2:27–59
Rey S (2001) Spatial empirics for economic growth and convergence. Geogr Anal 33:195–214
Rossi-Hansberg E, Wright M (2004) Urban structure and growth. Mimeo, Stanford University, Economics Department
Soo KT (2005) Zipf’s law for cities: a cross-country investigation. J Urban Econ 35:239–263
Stanback T Jr (1991) The new suburbanisation: challenge to the central city. Westview Press, Boulder
Tan B, Yilmaz K (2002) Markov chain test for time dependence and homogeneity: an analytical and empirical evaluation. Euro J Oper Res 137:524–543
Tuñón de Lara M, Malerbe PC (1982) La caída del rey. De la quiebra de la Restauración a la República (1917–36). Historia 16, vol. XXIII
Tuñón de Lara M, Viñas A (1982) La España de la cruzada. Guerra civil y primer franquismo (1936–1959). Historia 16, vol. XXIV
Tuñón de Lara M, Bahamonde A, Toro J, Arostegui J (1982) La España de los caciques. Del sexenio democrático a la crisis de 1917. Historia 16, vol. XXII
White E, Hewings GJD (1982) Space-time employment modelling: some results using seemingly unrelated regression estimators. J Reg Sci 22:283–302
Zellner A (1962) An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J Am Stat Assoc 57:348–368
Zipf GK (1949) Human behavior and the principle of least effort. Addison-Wesley, Cambridge

A class of spatial econometric methods in the empirical analysis of clusters of firms in the space

Giuseppe Arbia · Giuseppe Espa · Danny Quah

Abstract In this paper we aim at identifying stylized facts in order to suggest adequate models for the co-agglomeration of industries in space. We describe a class of spatial statistical methods for the empirical analysis of spatial clusters. The main innovation of the paper consists in considering clustering for bivariate (rather than univariate) distributions. This allows uncovering co-agglomeration and repulsion phenomena between the different sectors. Furthermore, we present empirical evidence on the pair-wise intra-sectoral spatial distribution of patents in Italy in the 1990s. We identify some distinctive joint patterns of location between different sectors and we propose some possible economic interpretations.

Keywords Agglomeration · Bivariate K-functions · Co-agglomeration · Spatial clusters · Spatial econometrics

A previous version of this paper was presented at the Workshop on Spatial Econometrics and Statistics, held in Rome 25–27 May 2006. We wish to thank the participants for their useful comments. The comments of two anonymous referees, which substantially improved the quality of our work, are also gratefully acknowledged.

G. Arbia (corresponding author), Department of the Business, Statistical, Technological and Environmental Sciences, University “G. d’Annunzio” of Chieti-Pescara, Viale Pindaro, 42, 65127 Pescara, Italy
G. Espa, Department of Economics, University of Trento, via Inama, 5, 38100 Trento, Italy
D. Quah, Economics Department, London School of Economics, Houghton Street, WC2A 2AE London, UK


JEL Classification C21 · D92 · L60 · O18 · R12

1 Introduction

The dominating feature of economic activities is certainly that of clustering both in space and time. However, even if the application of statistical techniques for modelling clustering in time (business cycles) has been a central concern of applied economists for decades, it is only relatively recently that research has concentrated on the development of appropriate methods to detect spatial clustering of economic activities both on a discrete space and on a continuous space. The possibility of modelling the spatial dimension of economic activities is of paramount interest for a number of reasons. First of all, the study of the spatial concentration of economic activities can shed light on economic theoretic hypotheses concerning the nature of increasing returns and the determinants of agglomeration. These hypotheses are of paramount importance in international trade and in economic growth theories. A second important reason is that the effects of policy measures to foster economic growth and development are strongly dependent on geographical clustering. Finally, spatial clustering, as a synonym of regional inequality, is a central political issue as a proxy of individual inequality and as the basis for cross-country inequality.

After the recent reinterpretation of Marshall’s (1890) insights on nineteenth century industrial clustering in space, due mainly to Krugman (1991) and Fujita et al. (1999), the empirical analysis on this subject has developed along two distinct lines of research. Along the first of these two lines, the literature records attempts to examine directly the underlying economic mechanism, using the spatial dimension primarily as a source of data. In this respect, panel data or pure spatial regressions are used, employing observable covariates related to space.
Such regressions end up constructing a hypothetical representative unit (a “site”) and concentrate on the impact of differing covariate values on the performance of that representative unit (see e.g., Ciccone and Hall 1996; Jaffe et al. 1993; Rauch 1993; Henderson 2003). Here we follow the second line of research, which attempts to characterize the entire spatial distribution of economic activities relative to a set of hypotheses (e.g., certain regular patterns of industrial concentration). In this second instance, interest does not rest on the characteristics of a representative unit, but rather on the joint behaviour of the different units distributed across space. Along these lines, Duranton and Overman (2005) refer to three generations of measures of spatial concentration. A first generation considered Gini-type measures where space played no role (e.g., Krugman 1991). A second generation (perhaps initiated by Ellison and Glaeser 1997) introduced measures that take space into account and tend to control for the underlying industrial concentration (Maurel and Sedillot 1999; Devereux et al. 2004). Such measures are based on data observed on a grid of administrative areas, thus neglecting the problem of the arbitrariness of the geographical partition used. This problem is known in the statistical literature as the modifiable unit problem (Yule and Kendall 1950) and assumes here the specific facet of the modifiable areal unit problem (or MAUP) discussed at length in Arbia (1989). Arbia (2001) and Duranton and Overman (2005)


provide an exhaustive account of the advantages of a third generation of measures that looks at maps of points on a continuous, rather than a discrete, space: following this approach we can easily compare results across different scales.

In the present paper we look at the distribution of industries as a map of points in space and we characterize their geographical distribution through the Lotwick–Silverman (Lotwick and Silverman 1982) extension of the spatial statistical nonparametric tool known as the K-function (Ripley 1977). The use of K-functions in economic analysis was first introduced by Arbia and Espa (1996) and then exploited by Marcon and Puech (2003a,b), by Quah and Simpson (2003) and by Duranton and Overman (2005), amongst others. Ioannides and Overman (2004) also look at the properties of points on a continuous space, but using a different approach. Compared to previous contributions, the major innovation of the present paper is to consider clustering for bivariate (rather than univariate) distributions, which allows us to uncover agglomeration and repulsion phenomena between different sectors. The literature often refers to this subject as co-agglomeration or co-localization (Devereux et al. 2004; Duranton and Overman 2005). The functionals we propose to characterize a joint pattern of points satisfy the five conditions suggested by Duranton and Overman (2005) for a concentration measure.¹ In particular we introduce the use of the K function to control for the underlying industrial concentration.

The theoretical ground on which such a modelling framework is based is well described in Quah and Simpson (2003), where the authors stress the importance of being able to specify a model determining a spatial law of motion and its evolution through time.
The Quah–Simpson model assumes that each individual economic agent aims at maximizing his own profit by locating his activity where the average spatial return is higher, with a gradual time-adjustment based on a cost function. Equilibrium is then achieved by optimizing the choices of each individual economic agent. The result is a (space–time) partial differential equation that expresses the economic activity at each point in time and space as a density. A good way of approaching the empirical analysis of clusters in space is then a modelling framework that is able to describe how such densities change in space conditionally upon the observed pattern of points.

A field where what Quah and Simpson (2003) define as a spatial law of motion is particularly well grounded theoretically is the analysis of the spatial diffusion of innovation and of technological spillovers. For this reason the empirical part of this paper is devoted to the analysis of the joint location of innovations.

1 The five requirements suggested by Duranton and Overman (2005) for a concentration measure are the following: any measure (1) should be comparable across industries; (2) should control for the overall agglomeration of manufacturing; (3) should control for industrial concentration; (4) should be unbiased with respect to scale and aggregation; and, finally, (5) should also give an indication of the significance of the results.

The studies on knowledge spillovers have received increasing attention in the literature on economic growth. In fact some theories explicitly link the presence of innovations to the growth of cities (Jacobs 1969, 1984; Bairoch 1988), seen as the places where the large concentration of individuals, firms and workers creates positive externalities which, in turn, foster economic growth. Even if economic theory has produced important advances in this direction (Arrow 1962; Romer 1986; Lucas 1988; Porter 1990), empirical evidence is still largely lacking. In fact a large part of the empirical literature has concentrated on measuring the impact of technological spillovers on the innovation performance of regions. In many instances the number of patents and their citations have been used as proxies of the flow of knowledge and of the related innovative output (Jaffe and Trajtenberg 1996, 2002; Jaffe et al. 2000). One possible approach is based on the notion of the knowledge production function introduced by Griliches (1979), which links regional innovative output with measures of regional innovative inputs such as R&D expenses (see e.g., Jaffe 1989; Audretsch and Feldman 1996; Acs et al. 1994). These studies provide significant evidence of the impact of localized R&D inputs on the innovation performance of regions. There is comparatively less empirical evidence on the effects of localized knowledge spillovers (Glaeser 1992; Henderson and Kunkoro 1995; Henderson 2003) and no definite answer is yet available to the question of whether knowledge flows are favoured by regional specialization within firms or, vice versa, by industrial diversification.

In this paper we wish to show the importance of distance-based measures of spatial concentration in tackling this important emerging research area and to provide new statistical tools to study the interaction between spatial concentration, regional growth and knowledge spillovers.

The layout of the paper is the following. In Sects. 2 and 3 we will thoroughly review the statistical reference framework by presenting a set of tools to identify clusters of industries in space. Specifically, in Sect. 2 we will concentrate on the bivariate version of Ripley’s (1977) K function. In Sect.
3 we will discuss the system of hypotheses at the basis of the identification of clusters, distinguishing two possible null hypotheses to be contrasted with the hypothesis of spatial clustering of industries. Section 4 is devoted to an empirical application of the bivariate K function to the study of the inter-sectoral location of innovation in Italy, based on a dataset of the European Patent Office (EPO). Finally, Sect. 5 contains some conclusions and directions for further developments in the field.

2 The statistical theoretical framework: bivariate K functions

Univariate K-functions (proposed by Ripley 1976, 1977) have already been used in economic geography to characterize the geographical concentration of industries (see e.g., Arbia and Espa 1996; Marcon and Puech 2003a; Quah and Simpson 2003). In this paper we will consider a bivariate extension of such a method to describe spatial clusters of pairs of firms. Although the approach could be straightforwardly generalized to an arbitrary number of, say, $g$ ($g > 2$) industries, in this paper we will deliberately restrict ourselves to the bivariate case for the sake of illustrating the methodology. The method is based on a bivariate functional of distance $t$ (that we will refer to as $K_{ij}(t)$) which characterizes the joint spatial pattern of points or, more precisely, the spatial relationships between two typologies of points located in the same study area: for instance firms belonging to two different industrial sectors, say Type $i$ and Type $j$. The bivariate K function is defined as follows:


$$K_{ij}(t) = \lambda_j^{-1}\, E\{\text{\# of points of Type } j \text{ falling at a distance} \le t \text{ from an arbitrary Type } i \text{ point}\} \tag{1}$$

with $E\{\cdot\}$ indicating the expectation operator and the parameter $\lambda_j$ representing the intensity of the Type $j$ point process, that is the number of Type $j$ points per unit area. Obviously, in the presence of a multivariate point process we have $g$ typologies of events and, consequently, $g^2$ bivariate K functions, that is: $K_{11}(t), K_{12}(t), \ldots, K_{1g}(t), K_{21}(t), K_{22}(t), \ldots, K_{2g}(t), \ldots, K_{gg}(t)$. In the remainder we will distinguish between univariate and bivariate K functions by calling auto-functions the K functions when $i = j$ and cross-functions those when $i \neq j$. When $i = j$, $K_{ii} = K_i$ represents the more traditional univariate auto-function K used in economic analysis by, e.g., Quah and Simpson (2003), Marcon and Puech (2003a) and Duranton and Overman (2005).

In Eq. 1, the term $\lambda_j K_{ij}(t)$ represents the expected number of Type $j$ points falling within a circle of radius $t$ centred on an arbitrary Type $i$ point. Symmetrically we interpret the bivariate function $K_{ji}(t)$ in such a way that $\lambda_i K_{ji}(t)$ represents the expected number of Type $i$ points falling within a circle of radius $t$ centred on an arbitrary Type $j$ point. As in the case of the univariate K function, the bivariate K function is built under the assumption of isotropy (Arbia 2006), that is the case when no directional bias occurs in the neighbourhood of each point.

In a bivariate point map constituted by, say, $n_i$ points of Type $i$ and $n_j$ points of Type $j$ within an area $A$, we can define a class of estimators of the cross-functions K by close analogy to those suggested in the univariate case (Ripley 1977; Diggle 1983). To start with, let us consider the indicator function:

$$I_{lk}(t) = \begin{cases} 1 & \text{if } d_{lk} \le t \\ 0 & \text{if } d_{lk} > t \end{cases}$$

where $d_{lk}$ represents the distance between the $l$th Type $i$ point and the $k$th Type $j$ point. If no border effects are present, then the non-parametric estimator of the cross-function $K_{ij}(t)$ can be expressed as:

$$\hat K_{ij}(t) = \left(\hat\lambda_i \hat\lambda_j A\right)^{-1} \sum_{l=1}^{n_i} \sum_{k=1}^{n_j} I_{lk}(t)$$

where $A$ is the total surface of the area, $\hat\lambda_i = n_i/A$ and $\hat\lambda_j = n_j/A$. Analogously, by inverting the roles of Type $i$ and Type $j$ points, the corresponding non-parametric estimator for the cross-function $K_{ji}(t)$ is given by:

$$\hat K_{ji}(t) = \left(\hat\lambda_i \hat\lambda_j A\right)^{-1} \sum_{l=1}^{n_j} \sum_{k=1}^{n_i} \nu_{lk}^{-1}(t)\, J_{lk}(t)$$

with $\nu_{lk}(t)$ and $J_{lk}(t)$ analogous to the $I_{lk}(t)$ functions in the previous expression.
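The estimator without border correction can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `cross_k` and its arguments are hypothetical, NumPy is assumed to be available, coordinates are planar, and no edge correction of any kind is applied.

```python
import numpy as np

def cross_k(points_i, points_j, t_values, area):
    """Naive estimate of K_ij(t), ignoring border effects (illustrative only).

    points_i, points_j : arrays of shape (n_i, 2) and (n_j, 2), planar coordinates
    t_values           : 1-D array of distances t
    area               : surface A of the study area (coordinate units squared)
    """
    points_i = np.asarray(points_i, dtype=float)
    points_j = np.asarray(points_j, dtype=float)
    t_values = np.asarray(t_values, dtype=float)
    n_i, n_j = len(points_i), len(points_j)
    lam_i, lam_j = n_i / area, n_j / area  # estimated intensities

    # d[l, k] = Euclidean distance between the l-th Type i and the k-th Type j point
    diff = points_i[:, None, :] - points_j[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))

    # double sum of the indicators I_lk(t), evaluated for each t
    counts = np.array([(d <= t).sum() for t in t_values])
    return counts / (lam_i * lam_j * area)
```

Because the double sum of indicators does not depend on which type is taken first, this uncorrected version returns the same values for K_ij and K_ji; the two estimates only differ once asymmetric border corrections are introduced. For an auto-function ($i = j$) the zero self-distances would have to be excluded, which this sketch does not do.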


If the generating random field is stationary and isotropic (Arbia 2006), then $K_{ij}(t)$ should be equal to $K_{ji}(t)$. However, due to possible border effects² and to the asymmetry of the related corrections, $\hat K_{ij}(t)$ and $\hat K_{ji}(t)$ will not be exactly equal, although they are strongly correlated. A more efficient (although not absolutely efficient) estimator is thus the one proposed by Lotwick and Silverman (1982), given by:

$$K^*_{ij}(t) = \frac{\hat\lambda_j \hat K_{ij}(t) + \hat\lambda_i \hat K_{ji}(t)}{\hat\lambda_i + \hat\lambda_j} \tag{2}$$
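As a small illustration of Eq. 2, the weighted combination can be written as follows. The function name `lotwick_silverman` is hypothetical, NumPy is assumed, and the two inputs are taken to be cross-function estimates already evaluated on a common grid of distances.

```python
import numpy as np

def lotwick_silverman(k_ij, k_ji, n_i, n_j, area):
    """Combine two cross-function estimates as in Eq. 2 (illustrative sketch).

    k_ij, k_ji : estimates of K_ij(t) and K_ji(t) on the same grid of distances
    n_i, n_j   : number of Type i and Type j points
    area       : surface A of the study area
    """
    lam_i, lam_j = n_i / area, n_j / area  # estimated intensities
    k_ij = np.asarray(k_ij, dtype=float)
    k_ji = np.asarray(k_ji, dtype=float)
    return (lam_j * k_ij + lam_i * k_ji) / (lam_i + lam_j)
```

Since both intensities are divided by the same area $A$, the weights reduce to $n_j/(n_i+n_j)$ and $n_i/(n_i+n_j)$. With the counts reported later in Table 1 for sectors S1 (951 patents) and S2 (831 patents), the weights on $\hat K_{12}$ and $\hat K_{21}$ would be roughly 0.47 and 0.53.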

As with Ripley’s univariate K function, in the case of the multivariate functions we can also introduce the L transformation proposed by Besag (1977), which is characterized by a more stable variance. In the bivariate case the $\hat L_{ij}(t)$ functions assume the following expressions:

$$\hat L_{ij}(t) = \sqrt{\hat K_{ij}(t)/\pi} \quad \text{and} \quad \hat L_{ji}(t) = \sqrt{\hat K_{ji}(t)/\pi}$$

where the functions are linearized by dividing by $\pi$ and taking the square root, which also stabilizes the variance. Similarly to Eq. 2 we can consider the Lotwick–Silverman transformation:

$$L^*_{ij}(t) = \frac{\hat\lambda_j \hat L_{ij}(t) + \hat\lambda_i \hat L_{ji}(t)}{\hat\lambda_i + \hat\lambda_j} = \sqrt{K^*_{ji}(t)/\pi}$$

which produces more efficient estimators of the L function.

3 The basic hypotheses of the model: spatial independence or random labelling?

In this section we introduce the alternatives offered in the spatial statistical literature to specify the null hypothesis of absence of regularities in the location of pairs of points in space. These will represent our counterfactuals in the subsequent empirical analysis reported in Sect. 4.2. In order to interpret correctly the estimates provided by $K^*_{ij}(t)$ and $L^*_{ij}(t)$ when testing the null hypothesis of absence of spatial interaction, the empirical estimates are traditionally compared with simulated envelopes. The reference framework for such tests is provided by Barnard (1963) and was adapted by Ripley (1979) to the case of univariate spatial clusters. In the case of bivariate patterns the specification of the null hypothesis is more complicated. In fact we can have two possible definitions depending on the nature of the case examined: a null of independence and a null of random labelling (Diggle 1983; Dixon 2002). The choice between the two alternatives can strongly affect the final results and can lead to wrong conclusions. However, this distinction is often ignored in the literature, where the univariate procedures are sometimes uncritically applied (among the few exceptions see Diggle 1983 and, more recently, Dixon 2002; Goreaud and Pélissier 2003). The two cases will now be discussed in some detail.

2 On “border effects” and corrections for them see e.g., Ripley (1981). Explicit expressions of the correction factors in the case of irregular study areas are derived in Goreaud and Pélissier (1999).

3.1 The null hypothesis of independence

According to a first specification of the null hypothesis, the two typologies of points on the map can be conceived as two populations and the resulting spatial pattern can be interpreted a priori as the outcome of two distinct point random fields. In this situation the absence of interaction between the two components corresponds to the lack of interaction between the two generating fields. In other words, the location of points generated by the field related to Typology $i$ is independent of the location of points generated by the field related to Typology $j$ (Lotwick and Silverman 1982). Under this hypothesis, therefore, we have that $K_{ij}(t) = \pi t^2$. We will refer to this first null hypothesis as the “hypothesis of independence” and we will indicate it with the symbol $H_{01}$. If within the circle of radius $t$ centred on an arbitrary Type $i$ point we record the presence of more Type $j$ points than we expect under $H_{01}$, then $K_{ij}(t) > \pi t^2$, where $\pi t^2$ is the surface of a circle of radius $t$. Such a result indicates a positive dependence between the two components and, hence, the presence of agglomeration between the two generating fields. In contrast, if within the circle of radius $t$ centred on an arbitrary Type $i$ point we record the presence of fewer Type $j$ points than expected under $H_{01}$, then $K_{ij}(t) < \pi t^2$, thus indicating repulsion (or inhibition) rather than agglomeration.
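The benchmark value $\pi t^2$ can be traced back to the definition in Eq. 1 in three short steps. The derivation below is a sketch that assumes the Type $j$ field is stationary with intensity $\lambda_j$ and independent of the Type $i$ field, so that the expected number of Type $j$ points in any region is $\lambda_j$ times its area.

```latex
\begin{align*}
E\{\#\ \text{Type } j \text{ points within distance } t \text{ of an arbitrary Type } i \text{ point}\}
  &= \lambda_j \, \pi t^2 , \\
K_{ij}(t) &= \lambda_j^{-1} \, \lambda_j \, \pi t^2 = \pi t^2 , \\
L_{ij}(t) &= \sqrt{K_{ij}(t)/\pi} = t .
\end{align*}
```

Under these assumptions the L transformation turns the benchmark into the straight line $L_{ij}(t) = t$, which is why departures from independence are usually read against the diagonal in plots of the L function.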
The confidence bands needed to run formal hypothesis testing procedures at the various distances can be built through Monte Carlo simulation (Besag and Diggle 1977; see also Ripley 1977; Goreaud and Pélissier 2003 for details).

3.2 The null hypothesis of random labelling

According to a second specification of the null hypothesis, each of the two components depends on some factors that a posteriori produce a differentiation between the two typologies of points. In the case of economic data, such factors can be identified in a set of explanatory variables encouraging the location of industries at a certain point in space and producing a different pattern in the two typologies of points. For instance they might refer to a differentiated system of taxes and incentives encouraging the location of Type $i$ points while discouraging the location of Type $j$ points. We will refer to this second hypothesis as “random labelling” and we will indicate it with the symbol $H_{02}$. The general reference framework in this second instance is that of the so-called marked point processes (see, e.g., Diggle 1983), that is point processes where not only the location of each object is reported, but also an extra characteristic that differentiates between them (e.g., small and large firms, presence or absence of an innovation, etc.).


Let the spatial structure of the indistinct generating process for the two typologies of points be synthesized by the univariate K auto-function (Ripley 1977):

$$\lambda K(t) = E\{\text{\# of points within a circle of radius } t \text{ around each point in a map}\}$$

where $\lambda = n/A$ represents the density and $n = n_1 + n_2$. Under $H_{02}$, the ratio $p_j = n_j/n$ represents the probability of belonging to Typology $j$. Then we have that $p_j \lambda K(t) = \lambda_j K(t) = \lambda_j K_{ij}(t)$, so that $K_{ij}(t) = K(t)$. If within the circle of radius $t$ centred on an arbitrary Type $i$ point there are more Type $j$ points than expected under $H_{02}$, then $K_{ij}(t) > K(t)$. This result indicates that at distance $t$ the two components tend to be positively dependent, thus revealing the presence of agglomeration. On the contrary, if within the circle of radius $t$ centred on an arbitrary Type $i$ point there are fewer Type $j$ points than expected under $H_{02}$, then $K_{ij}(t) < K(t)$, which indicates the presence of a negative dependence between the two components, or inhibition. Again, the confidence bands can be generated via Monte Carlo simulation.

When $H_{02}$ holds true, all the bivariate K functions (both the two auto-functions and the two cross-functions) are equal to the univariate K function of the map where there is no distinction between the two components, so that $K_{ij}(t) = K_{ji}(t) = K_{ii}(t) = K_{jj}(t) = K(t)$. Operationally, departures from the null of random labelling can be evaluated by computing the pair-wise differences between the various K functionals and by comparing them with the simulated confidence bands (see Diggle and Chetwynd 1991; Gatrell et al. 1996; Kulldorff 1998; Dixon 2002; Haining 2003).
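One way to picture the Monte Carlo procedure under random labelling is to keep the pooled locations fixed and reshuffle the sector labels. The sketch below is an illustration only, not the authors' implementation: the function names are hypothetical, NumPy is assumed, labels are coded 0 (Type $i$) and 1 (Type $j$), no border correction is applied, and the auto-function uses the common $n_i(n_i-1)$ normalisation, which is an assumption of this sketch.

```python
import numpy as np

def pair_counts(d, t_values):
    """Number of entries of the distance matrix d that are <= t, for each t."""
    return np.array([(d <= t).sum() for t in t_values])

def k_difference(coords, labels, t_values, area):
    """K_ii(t) - K_ij(t) without border correction (illustrative only)."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    pts_i, pts_j = coords[labels == 0], coords[labels == 1]
    n_i, n_j = len(pts_i), len(pts_j)
    d_ii = np.linalg.norm(pts_i[:, None] - pts_i[None, :], axis=2)
    d_ij = np.linalg.norm(pts_i[:, None] - pts_j[None, :], axis=2)
    # auto-function: remove the n_i zero self-distances on the diagonal
    k_ii = area * (pair_counts(d_ii, t_values) - n_i) / (n_i * (n_i - 1))
    k_ij = area * pair_counts(d_ij, t_values) / (n_i * n_j)
    return k_ii - k_ij

def random_labelling_envelope(coords, labels, t_values, area, n_sim=99, seed=0):
    """Observed difference and pointwise 95% band under permuted labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = k_difference(coords, labels, t_values, area)
    sims = np.array([
        k_difference(coords, rng.permutation(labels), t_values, area)
        for _ in range(n_sim)
    ])
    lower = np.quantile(sims, 0.025, axis=0)
    upper = np.quantile(sims, 0.975, axis=0)
    return observed, lower, upper
```

Permuting the labels leaves the pooled pattern and the two sample sizes unchanged, which is what random labelling requires. Values of the observed difference outside the band at a given distance would then be read as a departure from the null at that distance; the band is pointwise, so it does not control the error rate over all distances jointly.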
It is important to observe that the two alternatives for the null hypothesis considered in this section describe in statistical terms the usual distinction made in quantitative geography between two otherwise indistinguishable effects: the effect of spatial interaction between agents and the effect of spatial reaction to common factors (Cliff and Ord 1981). Both effects give rise to observed regularities in space. From a certain angle they also mirror the distinction made by some authors in the economic literature between joint-localization and co-localization (see e.g., Duranton and Overman 2005).

4 Characterizing the spatial distribution of innovations in Italy 1995–1999

4.1 Descriptive analysis

The empirical analysis focuses on the EPO dataset containing all patent applications made at the European Patent Office (established by the Munich Convention) starting from 1978. In particular we use the version elaborated at CESPRI, Bocconi University, Milan. The dataset provides us with information about petitioners, inventors, request date, International Patent Classification code (distinguishing the various industrial sectors), and citations among patents. The use of the inventors’ residence allows us to localize each patent exactly (Arbia et al. 2008). In our database we omitted Sicily, Sardinia and the other Italian islands because of the lack of spatial continuity with the mainland. This lack of continuity could produce serious biases in our procedures, which are based on Euclidean distances. The resulting dataset consists of 44,078 inventors, but only 25,312 patents owing to the presence of multi-inventor patents. To avoid any subjectivity in the assignment of a unique location to each patent, we considered only patents with a single inventor, which amount to 14,632 in our database. We leave to future refinements the problems related to the spatial assignment of patents with multiple inventors.

The patents are classified into six industrial sectors, namely: Electricity Electronics (S1), Instruments (S2), Chemical Pharmaceutical (S3), Process Engineering (S4), Mechanical Engineering Machinery (S5) and Consumer Goods Civil Engineering (S6). Our database also contains temporal information on two different dates in the registration of each patent, namely: (i) the publication date and (ii) the priority date, that is the date of the earliest filing of an application made in one of the patent offices adhering to the convention. The choice of one date or the other is crucial because the time lag between the priority date and the publication date may range from 1.5 to 2.5 years. We chose the latter because it is the date closest to the actual timing of the patented invention. In the modelling phase we restrict our attention to the most recent years, and so we employed only data based on the aggregation of the last 5 years (from January 1995 to December 1999). Dynamics are totally absent from the present analysis, and it is our intention to extend the analysis in a future study to a space–time context by exploiting the statistical literature on “space–time” K-functions introduced by Diggle (1993) and Diggle et al. (1995).

Figure 1 reports the yearly time series of the 44,078 inventors applying for patents at the EPO (Fig. 1a) and of those that are mono-inventors, in which case the spatial locations of inventor and patent coincide (Fig. 1b).

Fig. 1 a Inventors with at least 1 invention and b mono-inventors in the period 1990–1999 (yearly series; vertical axes: number of inventors and number of patents; horizontal axis: year t). Source: European Patent Office data processed at CESPRI, Bocconi University, Milan and at the Department of Economics, University of Trento
The selected period 1995–1999 coincides with an evident increase in the number of inventors. With reference to the same dataset, Table 1 reports the sectoral distribution of patents with just one inventor in the period 1995–1999. By distinguishing each patent in terms of its specific industrial sector we can interpret our data as a single realization of a multivariate spatial point process.

Table 1 Frequency distribution of the number of patents with only one inventor, distinguished by industrial sector, in the period 1995–1999

Sectors                                  Number of patents (1995–1999)
Electricity electronics = S1             951
Instruments = S2                         831
Chemical pharmaceutical = S3             675
Process engineering = S4                 1,302
Mechanical engineering machinery = S5    2,830
Consumer goods civil engineering = S6    1,690
Total                                    8,279

Source: European Patent Office data processed at CESPRI, Bocconi University, Milan and at the Department of Economics, University of Trento

The map reported in Fig. 2 displays the overall spatial distribution of the 8,279 cases. The map reveals a clear agglomeration pattern of the points within some innovative regions, namely Milan’s area in the centre–north and the north-eastern part of the country. Indeed, the northern regions as a whole are very innovative and contain more than 66% of the total number of patents. Conversely the central part of Italy contains only about 30% of the patents, with a remarkable concentration around Rome and Florence, and only 4% of the patents are located in the southern regions. These empirical findings are not surprising considering the well-known industrial gap existing between Italian regions and, specifically, the dualism between the north and the south of the country. Also, the recent literature on knowledge spillovers, and specifically the research that has concentrated on patent citations (Jaffe et al. 1993; Breschi and Lissoni 2001, 2006; Driffield 2006), provides a persuasive explanation for the observed geographical pattern.

Fig. 2 Location of 8,279 patents in Italy in 1995–1999. Source: European Patent Office data processed at CESPRI, Bocconi University, Milan and at Department of Economics, University of Trento

The same kind of information is contained in the graphs reported in Fig. 3, displaying the frequency distribution of inventors (Fig. 3a) and the ratio of the number of inventors to the number of individuals aged 15–65 in 1999 (Fig. 3b). In particular Fig. 3b shows that the higher concentration observed in the north cannot be due to population concentration and reveals a genuine prevalence of inventors in that area.

It is helpful to disaggregate the previous map into as many point processes as the number of industrial sectors, each one characterised by a different intensity. To help visualize this aspect, Fig. 4 reports the point map for each sector. The same feature

of higher concentration in the north appears evident for all six sectors considered. Indeed, the observed geographical patterns are very similar for the six sectors, with evident concentrations around the main industrial northern towns (Milan and Turin). Some minor, but still evident, concentrations can be observed around Bologna and Venice for patents of the Process Engineering, Mechanical Engineering Machinery and Consumer Goods Civil Engineering sectors, and on the east coast between Rimini and Ancona for the patents of the Consumer Goods Civil Engineering sector (see Fig. 4f).

Fig. 3 Regional distribution of patents in Italy in the period 1995–1999. a Number of patents, b patents per 1,000 inhabitants. Source: Istat and European Patent Office data elaborated at CESPRI, Bocconi University, Milan and at Department of Economics, University of Trento

Fig. 4 Location of 8,279 patents in Italy in 1995–1999 distinguished by sector. Source: European Patent Office data elaborated at CESPRI, Bocconi University, Milan and at Department of Economics, University of Trento. a Electricity Electronics, b Instruments, c Chemical Pharmaceutical, d Process Engineering, e Mechanical Engineering Machinery, f Consumer Goods Civil Engineering

4.2 Modelling the joint location of patents

In this section we first illustrate in some detail the method used for estimating the bivariate K functions. For this purpose we concentrate on the spatial interaction between patents of the Electricity Electronics sector and those of the Instruments sector. For all other pairs of sectors we will only report the results of the estimation procedures.

In order to perform an exploratory analysis of the spatial patterns of points, we initially converted the maps displayed in Fig. 4 into spatial densities by using a non-parametric kernel estimator. Figure 5 shows, for the two sectors of Electricity Electronics and Instruments, the kernel density estimate of the random process, say $\lambda(s)$, with $s \in \Re^2$ representing the vector of coordinates of the points. For the estimation we considered the following quadratic kernel (see Hastie and Tibshirani 1990; Fan and Gijbels 1996):

$$\hat\lambda_\tau(s) = \sum_{d_l < \tau} \frac{3}{\pi\tau^2} \left(1 - \frac{d_l^2}{\tau^2}\right)^2 \tag{3}$$

where $\tau$ represents the bandwidth (that is the parameter controlling the smoothness of the surface), $d_l$ is the distance between point $s$ and the event located at point $s_l$, and the summation refers to all distances $d_l < \tau$.

Fig. 5 Kernel estimation of the spatial density of the two point processes. The bandwidth parameter is set to τ = 50 km. a Electricity Electronics, b Instruments

Interesting insights in terms of exploring the relationships between the two sectors are obtained by superimposing the two maps of points reported in Fig. 4a, b. In order to do this we exploited the following kernel estimator:

$$\hat\lambda_{\tau;\, i/1+2}(s) = \frac{\hat\lambda_{\tau;\, i}(s)}{\hat\lambda_{\tau;\, 1+2}(s)}, \quad i = 1, 2$$

where both $\hat\lambda_{\tau;\, i}(s)$ and $\hat\lambda_{\tau;\, 1+2}(s)$ were computed using expression (3). We will refer to this kernel estimation as a dual kernel.
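The kernel in expression (3), whose functional form is often called the quartic or biweight kernel, and the dual kernel ratio can be sketched as follows. The function names are hypothetical, NumPy is assumed, no edge correction is applied, and the subscript 1+2 is interpreted here as the pooled set of events of both sectors, which is an assumption of this sketch.

```python
import numpy as np

def quartic_intensity(grid, events, tau):
    """Kernel intensity estimate of expression (3) at each grid location.

    grid   : array of shape (m, 2), locations s where the surface is evaluated
    events : array of shape (n, 2), observed point locations s_l
    tau    : bandwidth, in the same units as the coordinates
    """
    grid = np.asarray(grid, dtype=float)
    events = np.asarray(events, dtype=float)
    # squared distances between every grid location and every event
    d2 = ((grid[:, None, :] - events[None, :, :]) ** 2).sum(axis=2)
    u2 = d2 / tau ** 2
    # only events closer than tau contribute to the sum
    weights = np.where(u2 < 1.0, 3.0 / (np.pi * tau ** 2) * (1.0 - u2) ** 2, 0.0)
    return weights.sum(axis=1)

def dual_kernel(grid, events_1, events_2, tau):
    """Share of sector 1 in the pooled intensity; NaN where no event is within tau."""
    lam_1 = quartic_intensity(grid, events_1, tau)
    lam_12 = quartic_intensity(grid, np.vstack([events_1, events_2]), tau)
    ratio = np.full_like(lam_12, np.nan)
    np.divide(lam_1, lam_12, out=ratio, where=lam_12 > 0)
    return ratio
```

Because expression (3) is a sum over events, the pooled intensity equals the sum of the two sector intensities, so the two dual measures add up to one wherever they are defined. With coordinates in kilometres, `tau` would be set to the values quoted in the figure captions (50 km for Fig. 5, 35 km for Fig. 6).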


Fig. 6 Dual kernel estimation of the spatial density. The bandwidth parameter is τ = 35 km. a Electricity Electronics, b Instruments

Figure 6a, b displays the output of the estimation procedure obtained for the joint distribution of the Electricity Electronics and Instruments sectors. In particular, Fig. 6a displays with different shading the dual measure, ranging from 0 (where there are only patents of the Instruments sector) to 1 (where there are only patents of the Electricity Electronics sector). A darker shading in Fig. 6a reveals the prevalence of patents of the Electricity Electronics sector, whereas Fig. 6b displays the complementary information for the Instruments sector. In both graphs intermediate values in the scale allow us to identify areas where both sectors are present. Similar graphs were derived for all other pairs of sectors, but they are not reported here since they do not add particular insights.

Concerning the identification of the bivariate map, the hypothesis of random labelling seems better grounded than that of independence (see Sect. 3), in that the labelling of a patent to a certain sector occurs at a later moment than the deposit at the EPO. In the remainder of this section we will report the estimation results of the different bivariate versions of the K function that can be used to study the joint location of all pairs of sectors. To start with, Fig. 7 reports the behaviour of the $L^*$ function for the two sectors Electricity Electronics and Instruments. Similar graphs have been derived for all other pairs of sectors, but they are not reported here in that they display very similar behaviours.³ The effect of co-agglomeration is evident for all pairs of sectors at all distances, in that all graphs lie entirely above the diagonal that represents the case of random labelling. Thus points are more clustered than expected under the null hypothesis of random labelling.
In order to investigate this effect of co-agglomeration more closely, significant insight can be gained by inspecting the behaviour of the K cross-functions. In fact, as already said, under the null hypothesis of random labelling we have that

3 All graphs are available upon request from the authors or can be accessed directly on the website http://www.springerlink.com/content/102505/.


Let us start by commenting on Fig. 8a, which refers to the joint pattern of location of the Electricity Electronics and Instruments sectors. The graph of the functional $\hat K_{11}(t) - K^*_{12}(t)$ in Fig. 8a is always above the $K = 0$ horizontal line and above the bands at the 95% confidence level, thus indicating significant attraction at all distances. In contrast, the graph of the functional $\hat K_{22}(t) - K^*_{12}(t)$ suggests random labelling at small distances ($t < 30$ km), with the line falling entirely within the 95% confidence bands. Conversely, at higher distances, the graph falls in the lower part of the confidence bands, suggesting repulsion between the two sectors. As a consequence the locations of points referring to the two industrial sectors cannot be considered as randomly labelled; rather, they display an interesting pattern of attraction–repulsion. The patents of the Electricity Electronics sector tend to locate close to patents of the same sector, whereas patents of the Instruments sector display (at small distances below 30 km) a tendency to locate in the neighbourhood of patents of Electricity Electronics. Such a gravitational effect is dominated by an internal segregation effect (above the mentioned threshold of 30 km) that prevents the patents of the Instruments sector from forming patches. In summary: points of Electricity Electronics tend to cluster with one another, while points of Instruments display repulsion from one another and attraction towards those of Electricity Electronics. The first effect thus confirms Marshall–Arrow–Romer theoretical expectations (Marshall 1890; Arrow 1962; Romer 1986) and Porter’s idea of dynamic externalities generated by specialization (Porter 1990). The second effect follows from the dynamic externalities à la Jacobs arising from industrial diversification (Jacobs 1969). Commenting jointly on Figs.
7 and 8, we observe that the reciprocal aggregation effect between the Electricity Electronics sector and the Instruments sector, suggested by the behaviour of the function $L^*_{12}(t)$ in Fig. 7, is not the same in both directions and appears to be led by the Electricity Electronics sector. From an economic point of view, such a result can be easily interpreted by observing that the Instruments sector includes goods whose production requires technologies linked to the Electricity Electronics sector. Thus the knowledge flows generated by firms of the Electricity Electronics sector benefit the neighbouring industries of the Instruments sector. In contrast, the Electricity Electronics sector includes goods whose production does not require technologies linked to the Instruments sector, and it most likely benefits from internally generated knowledge flows. In our work we derived the graphs of the functionals $\hat{K}_{ii}(t) - K^*_{ij}(t)$ and $\hat{K}_{jj}(t) - K^*_{ji}(t)$ for all 15 possible pairs of the 6 sectors considered. However, here we report only some selected cases due to lack of space.5

5 Again, as for the case reported in Fig. 7, the graphs related to all pairs of sectors are available on the website http://www.springerlink.com/content/102505/.

By examining all possible pairs of sectors we can identify four different typologies of attraction–repulsion. The most common typology observed is the one that we have described in detail when commenting on the relationship between the Electricity Electronics and the Instruments sectors. In fact this pattern is very similar to that displayed by the Electricity Electronics sector on one side and the sectors Instruments, Chemical Pharmaceutical, Process Engineering and Mechanical Engineering Machinery on the


other. A similar pattern is also displayed by the pair constituted by Instruments versus Chemical Pharmaceutical and, with some minor differences, by the pairs involving the Chemical Pharmaceutical sector on one side and the Instruments, Process Engineering, Mechanical Engineering Machinery and Consumer goods Civil Engineering sectors on the other side. In this first, dominant typology we observe clusters of points of one sector at small distances (between 30 and 50 km) co-existing with points of the second sector that are internally over-dispersed. At high distances points of the second sector become randomly labelled. Only in the case of the relationships between the sector of Chemical Pharmaceutical and Mechanical and Civil Engineering do we observe a tendency to cluster after 100 km. It is, however, important to remember that at these higher distances the number of points on which the estimation is based decreases dramatically, and thus the estimates are less reliable due to the lack of degrees of freedom. This first typology describes well the stylized facts suggested by Duranton and Overman (2005).

A second typology of attraction–repulsion is displayed by the pairs of sectors involving the relationships between the Instruments sector on one side and the sectors of Process Engineering, Mechanical Engineering Machinery and Consumer goods Civil Engineering on the other. As an example of this second typology, Fig. 8b displays the case of Instruments versus Process Engineering. Here the pattern displays clusters of one sector at small distances (less than 20 km) attracting a second sector that is also self-clustered. At higher distances we conversely observe segregation.

A third typology is displayed by the pairs of points referring to the relationships between the Process Engineering sector on one side and the Mechanical Engineering Machinery and Consumer goods Civil Engineering sectors on the other.
Figure 8c displays the case of Process Engineering versus Mechanical Engineering Machinery as an example of this typology. Here we notice a tendency to cluster for the points of one sector, which also produces a strong attraction on the points of the other sector. At high distances (more than 150 km) we conversely observe repulsion in the pattern of the first sector. The effect of concentration of points of the first sector is persistent at all distances in the case of the patents of the Mechanical Engineering Machinery sector versus those of the Consumer goods Civil Engineering sector.

Finally we have a residual typology in which we can classify the exceptions to the three typologies described above. These exceptions are represented by the patterns displayed by Electricity Electronics versus Chemical Pharmaceutical and by Process Engineering versus Consumer goods Civil Engineering. In the first instance we observe clusters of one sector at small distances co-existing with a second sector that appears to be randomly labelled. In the second instance we have clusters of both sectors only at intermediate distances, ranging between 120 and 160 km.

4.3 Controlling for the underlying industrial concentration

The functionals considered in the previous section can help in identifying bivariate clusters of firms, but they treat space as homogeneous, with all portions of space having a priori the same probability of hosting a point. Conversely, the economic space is highly heterogeneous. On this basis Ellison and Glaeser (1997) suggest that, when


looking at the location patterns of firms, the null hypothesis should be that of spatial randomness, but only conditional upon both industrial concentration and overall agglomeration. Their index satisfies this requirement. Similarly, Maurel and Sedillot (1999) and Devereux et al. (2004) develop indices with similar properties. Duranton and Overman (2005) notice that “unevenness does not necessarily mean an industry is localized” and translate this consideration into the formal requirement that “any informative measure of localization must control for industrial concentration” (p. 1078; see also Sect. 1 and Footnote 5). They go further in suggesting that not all points in space can host a new point, and require that “the set of all existing sites currently used by a manufacturing establishment constitutes the set of all possible locations for any point” (p. 1085). In this last section we introduce the use of the bivariate K functions to fulfil the Ellison–Glaeser requirement, and we compare for each sector the actual pattern with the pattern generated by all patents considered as a whole. In more detail, we compute the differences between the bivariate K function for each sector on one side and the univariate K function computed considering all sectors together on the other. Such a difference can help in identifying sectors that are over-concentrated (over-dispersed) not in absolute terms, as in the traditional univariate K analysis (Marcon and Puech 2003a), but conditionally upon the spatial pattern displayed by the other firms for the economy as a whole. Of course we are aware that, for this analysis to be complete, we should consider the spatial pattern of all firms in the economy, but in the empirical analysis reported here we restrict ourselves to the pattern of patents included in the EPO database, to have at least some indications.
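The comparison described above can be sketched in a few lines of code. This is only a minimal illustration under stated assumptions, not the authors' implementation: it uses a naive estimator of Ripley's K with no edge correction, planar coordinates in consistent units (e.g. km) and a known study-area size, and the function names `ripley_k` and `k_difference` are hypothetical.

```python
import numpy as np

def ripley_k(points, distances, area):
    """Naive Ripley's K estimate (assumption: no edge correction)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep ordered pairs (i, j) with i != j.
    pair_d = d[~np.eye(n, dtype=bool)]
    counts = np.array([(pair_d <= t).sum() for t in distances])
    return area * counts / (n * (n - 1))

def k_difference(sector_points, all_points, distances, area):
    """K of one sector minus K of the pooled pattern, at each distance."""
    return (ripley_k(sector_points, distances, area)
            - ripley_k(all_points, distances, area))

# Hypothetical toy data: a tight sector cluster inside a sparser pooled pattern.
sector = [(0, 0), (1, 0), (0, 1)]
pooled = sector + [(10, 10), (20, 5), (5, 20)]
print(k_difference(sector, pooled, [1.0, 1.5], area=400.0))
```

Positive values point to over-concentration of the sector relative to the pooled pattern and negative values to over-dispersion. Judging significance would still require simulated confidence bands, which this sketch leaves out.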
Figures 9a–f report the results of this analysis based on the functional $\hat{K}_{ii} - \hat{K}_{tt}$, with $\hat{K}_{tt}$ denoting the K function computed on all patents considered as a whole. We have also computed the functionals $\hat{K}_{tt} - \hat{K}_{it}$, but they are not reported here because they provide very similar information.6

6 These graphs, like those related to Figs. 7 and 8, are available at the website http://www.springerlink.com/content/102505/.

The examination of Fig. 9 reveals some interesting features. First of all, the patents of the Electricity Electronics and Chemical Pharmaceutical sectors display over-concentration at all distances with respect to the underlying global concentration of patents (see Fig. 9a, c). In particular, Electricity Electronics presents high and increasing concentration up to 80 km, whereas Chemical Pharmaceutical presents its maximum concentration at around 60 km, followed by a decreasing pattern and even randomness after 160 km. Such results parallel those of Duranton and Overman (2005), who found localization mostly at small scales (less than 60 km) and a general tendency for Chemical Pharmaceutical products to over-cluster. Secondly, the Instruments sector displays over-dispersion with respect to the underlying distribution of patents at distances greater than 50 km. Thirdly, the patents of the Process Engineering sector display a more complex pattern, with significant over-clustering only in the interval between 70 and 140 km. Finally, the Mechanical Engineering Machinery and the Consumer goods Civil Engineering sectors display random labelling at all distances (remember that estimates at high distances are less reliable due to the lack of degrees of freedom). Therefore in these two sectors there


seems to be no specific tendency to either clustering or repulsion apart from those that are characteristic of the economy as a whole.

5 Summary and concluding remarks

In this paper we extended the use of Ripley’s (1977) K functions, previously considered in the economic literature by Arbia and Espa (1996), Quah and Simpson (2003), Marcon and Puech (2003a) and Duranton and Overman (2005), to the analysis of the joint spatial pattern of industries. By applying a methodology based on the bivariate cross-functions K to the spatial distribution of patents in Italy in the period 1995–1999, we have been able to discern quite distinct geographical patterns for the six sectors considered. Our main findings are the following:

• The pattern displayed by the patents of all pairs of the six sectors considered is always one of agglomeration when analysed in absolute terms, looking at the standard bivariate K functions.
• However, when looking more closely at the pair-wise relationships between the six sectors considered, a more differentiated situation emerges. In fact, most of the observed joint patterns (precisely 8 of the 15 pairs of sectors considered) display a situation of dominance of one sector over the other. This dominance implies that there is a leading sector that is clustered in space at small distances (up to 50 km) and a second sector that is dispersed internally and clustered around the leader. In particular, this is the pattern displayed by the patents of the Electricity Electronics and the Chemical Pharmaceutical sectors, which act as leaders with respect to the other sectors.
• Such a specificity of the Electricity Electronics and the Chemical Pharmaceutical sectors also emerges when considering the analysis of agglomeration conditional on the global concentration of patents in the economy as a whole. In fact, in this case, the mentioned sectors are the only two that present over-clustering with respect to the general pattern of all patents. In particular we notice a peak at around 80 km for the patents of the Electricity Electronics sector and at around 60 km for the patents of the Chemical Pharmaceutical sector. Conversely, for the patents belonging to the other sectors we record over-dispersion for the Instruments sector and no significant departure from randomness in the remaining sectors.

The analysis considered here has shown the importance, but also the limits, of a static approach and the necessity to introduce temporal dynamics in order to reconstruct the whole process of individual choice behaviour. Thus an important step forward in the application of the spatial econometric techniques discussed here is represented by the introduction of the time dimension. In fact the analysis of the static bivariate K functions registers only the situation in one definite period of time and provides only a single snapshot of the whole dynamic process. Quite obviously, this snapshot can help in suggesting the mechanism generating individual locational choices as it unfolds in a dynamic context, just as a single frame reveals something about the nature of the film it is drawn from. For instance, the individual choice behaviour


of firms suggested by Fig. 8a could be interpreted as the process through which, at first, industries of one sector (say Type 1) locate themselves in space at random and industries of another sector (say Type 2) tend to locate around them to exploit technological and physical spillovers. If new Type 1 industries locate in the area, they tend to locate away from Type 2 industries, creating a buffer zone that can be due to physical or economic constraints. This behaviour seems to suggest a leading position of Type 1 industries with respect to Type 2. This dynamic, however, describes only one of the possible behaviours, and more refined spatial laws of motion could be suggested by the analysis of proper dynamic K functions. The theoretical bases for considering dynamic spatial patterns in economic analysis are well described by Quah and Simpson (2003). Dynamic “space–time” K-functions (Diggle 1993; Diggle et al. 1995) can be conceived as functionals depending on both spatial distance and time lag, which indicate how many points characterized by a certain label fall within a certain distance of other points after a certain period of time. Such an analysis would greatly help the study of the concentration of industries and the analysis of diffusion processes which, in turn, are issues of great importance when analysing sectoral growth and the rise and fall of regions within one country. Since understanding the dynamics of the spatial distribution of firms can also help to clarify the complex mechanisms of international trade, the development of this field appears to be one of the most challenging items on the future research agenda.
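To give a concrete feel for a space–time K function, the following sketch counts pairs of points that are close in both space and time. It is a deliberately simplified illustration under explicit assumptions: a single unlabelled pattern (not the labelled, bivariate version discussed above), a time lag taken in absolute value, and no spatial or temporal edge correction. The name `space_time_k` and all arguments are hypothetical.

```python
import numpy as np

def space_time_k(coords, times, s_values, t_values, area, duration):
    """Naive space-time K on a grid of (s, t); assumption: no edge correction."""
    xy = np.asarray(coords, dtype=float)
    tt = np.asarray(times, dtype=float)
    n = len(xy)
    # Pairwise spatial distances and absolute time lags.
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=-1))
    u = np.abs(tt[:, None] - tt[None, :])
    not_self = ~np.eye(n, dtype=bool)
    counts = np.empty((len(s_values), len(t_values)))
    for a, s in enumerate(s_values):
        for b, t in enumerate(t_values):
            # Ordered pairs within distance s AND within time lag t.
            counts[a, b] = ((d <= s) & (u <= t) & not_self).sum()
    return area * duration * counts / (n * (n - 1))

# Hypothetical toy data: two events close in space and time, one far away.
coords = [(0, 0), (1, 0), (5, 5)]
times = [0, 1, 10]
print(space_time_k(coords, times, [1.5, 10.0], [2.0, 20.0], area=1.0, duration=1.0))
```

A labelled version would restrict the count to pairs formed by one point of each type and could keep the sign of the time lag, so as to distinguish which sector arrives first.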

References

Acs ZJ, Audretsch DB, Feldman MP (1994) R&D spillovers and recipient firm size. Rev Econ Stat 76(2):336–340
Arbia G (1989) Spatial data configuration in regional economic and related problems. Kluwer, Dordrecht
Arbia G (2001) Modelling the geography of economic activities in a continuous space. Pap Reg Sci 80:411–424
Arbia G (2006) Spatial econometrics: with applications to regional convergence. Springer, Berlin
Arbia G, Espa G (1996) Statistica economica territoriale. Cedam, Padua
Arbia G, Copetti M, Diggle PJ (2008) Modelling individual behaviour of firms in the study of spatial concentration. In: Fratesi U, Senn L (eds) Growth in interconnected territories: innovation dynamics, local factors and agents. Springer, Berlin (forthcoming)
Arrow KJ (1962) The economic implications of learning by doing. Rev Econ Stud 155–173
Audretsch DB, Feldman MP (1996) R&D spillovers and the geography of innovation and production. Am Econ Rev 86(3):630–640
Bairoch P (1988) Cities and economic development: from the dawn of history to the present. University of Chicago Press, Chicago
Barnard GA (1963) Contribution to the discussion of Professor Bartlett’s paper. J R Stat Soc B 25:294
Besag J (1977) Contribution to the discussion of Dr. Ripley’s paper. J R Stat Soc B 39:193–195
Besag J, Diggle PJ (1977) Simple Monte Carlo test for spatial pattern. Appl Stat 26:327–333
Breschi S, Lissoni F (2001) Knowledge spillovers and local innovation systems: a critical survey. Ind Corp Change 10(4):975–1005
Breschi S, Lissoni F (2006) Cross-firm inventors and social networks: localised knowledge spillovers revisited. Ann Econ Stat
Ciccone A, Hall RE (1996) Productivity and the density of economic activity. Am Econ Rev 86(1):54–70
Cliff AD, Ord JK (1981) Spatial processes: models and applications. Pion, London
Devereux MP, Griffith R, Simpson H (2004) The geographic distribution of production activity in the UK. Reg Sci Urban Econ 34:533–564
Diggle PJ (1983) Statistical analysis of spatial point patterns. Academic, New York
Diggle PJ (1993) Point process modelling in environmental epidemiology. In: Barnett V, Turkman KF (eds) Statistics for the environment. Wiley, Chichester
Diggle PJ, Chetwynd AG (1991) Second-order analysis of spatial clustering. Biometrics 47:1155–1163
Diggle PJ, Chetwynd AG, Haggkvist R, Morris S (1995) Second-order analysis of space–time clustering. Stat Methods Med Res 4:124–136
Dixon PM (2002) Ripley’s K function. In: El-Shaarawi AH, Piegorsch WW (eds) Encyclopedia of environmetrics, vol 3. Wiley, Chichester, pp 1796–1803
Driffield N (2006) On the search for spillovers from Foreign Direct Investment (FDI) with spatial dependency. Reg Stud 40:107–119
Duranton G, Overman HG (2005) Testing for localisation using micro-geographic data. Rev Econ Stud 72:1077–1106
Ellison G, Glaeser EL (1997) Geographic concentration in U.S. manufacturing industries: a dartboard approach. J Pol Econ 105(5):889–927
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman and Hall, London
Fujita M, Krugman P, Venables A (1999) The spatial economy: cities, regions, and international trade. MIT Press, Cambridge
Gatrell AC, Bailey TC, Diggle PJ, Rowlingson BS (1996) Spatial point pattern analysis and its application in geographical epidemiology. Trans Inst Br Geogr 21:256–274
Glaeser EL, Kallal HD, Scheinkman JA, Shleifer A (1992) Growth in cities. J Pol Econ 100(6):1126–1152
Goreaud F, Pélissier R (1999) On explicit formulas of edge effect correction for Ripley’s K-function. J Veg Sci 10:433–438
Goreaud F, Pélissier R (2003) Avoiding misinterpretation of biotic interactions with the intertype K12-function: population independence vs. random labelling hypotheses. J Veg Sci 14:681–692
Griliches Z (1979) Issues in assessing the contribution of research and development to productivity growth. Bell J Econ 10:92–116
Haining RP (2003) Spatial data analysis: theory and practice. Cambridge University Press, Cambridge
Hastie T, Tibshirani R (1990) Generalized additive models. Chapman and Hall, London
Henderson JV (2003) Marshall’s scale economies. J Urban Econ 53(1):1–28
Henderson V, Kuncoro A, Turner M (1995) Industrial development in cities. J Pol Econ 103:1067–1090
Ioannides YM, Overman HG (2004) Spatial evolution of the US urban system. J Econ Stud 4:131–156
Jacobs J (1969) The economy of cities. Random House, New York
Jacobs J (1984) Cities and the wealth of nations: principles of economic life. Vintage, New York
Jaffe AB (1989) Real effects of academic research. Am Econ Rev 79(5):957–970
Jaffe AB, Trajtenberg M (1996) Flows of knowledge from universities and federal laboratories: modeling the flow of patent citations over time and across institutional and geographic boundaries. In: Proceedings of national academy of science vol 93, pp 12671–12677
Jaffe AB, Trajtenberg M (2002) Patents, citations and innovations: a window on the knowledge economy. MIT Press, Cambridge
Jaffe AB, Trajtenberg M, Henderson R (1993) Geographic localization of knowledge spillovers as evidenced in patent citations. Q J Econ 108(3):577–598
Jaffe AB, Trajtenberg M, Fogarty M (2000) The meaning of patent citations: report of the NBER/Case Western Reserve survey of patentees. NBER working paper No. 7631
Krugman P (1991) Geography and trade. MIT Press, Cambridge
Kulldorff M (1998) Statistical methods for spatial epidemiology: test for randomness. In: Gatrell A, Löytönen M (eds) GIS and health. Taylor & Francis, London, pp 49–62
Lotwick HW, Silverman BW (1982) Methods for analysing spatial processes of several types of points. J R Stat Soc B 44:406–413
Lucas RE (1988) On the mechanics of economic development. J Monet Econ 22(1):3–42
Marcon E, Puech F (2003a) Evaluating geographic concentration of industries using distance-based methods. J Econ Geogr 3(4):409–428
Marcon E, Puech F (2003b) Generalizing Ripley’s K function to inhomogeneous populations. mimeo
Marshall A (1890) Principles of economics. Macmillan, London
Martens SN, Breshears DD, Meyer CW, Barnes FJ (1997) Scales of above-ground competition in semi arid woodland detected from spatial patterns. J Veg Sci 8:655–664
Maurel F, Sedillot B (1999) A measure for geographical concentration of French manufacturing industries. Reg Sci Urban Econ 29(5):575–604
Porter ME (1990) The competitive advantage of nations. Free Press, New York
Quah D, Simpson H (2003) Spatial cluster empirics. mimeo
Rauch JE (1993) Productivity gains from geographic concentration of human capital: evidence from the cities. J Urban Econ 43(3):380–400
Ripley BD (1976) The second-order analysis of stationary point processes. J Appl Probab 13:255–266
Ripley BD (1977) Modelling spatial patterns (with discussion). J R Stat Soc B 39:172–212
Ripley BD (1979) Test of ‘randomness’ for spatial point pattern. J R Stat Soc B 41:368–374
Ripley BD (1981) Spatial statistics. Wiley
Romer PM (1986) Increasing returns and long-run growth. J Pol Econ 94:1002–1037
Yule GU, Kendall MG (1950) An introduction to the theory of statistics. Griffin, London

A spatially filtered mixture of β-convergence regressions for EU regions, 1980–2002

Michele Battisti · Gianfranco Di Vaio

Abstract Assessing regional growth and convergence across Europe is a matter of primary relevance. Empirical models that do not account for structural heterogeneities and spatial effects may face serious misspecification problems. In this work, a mixture regression approach is applied to the β-convergence model, in order to produce an endogenous selection of regional growth patterns. A priori choices, such as North–South or centre-periphery divisions, are avoided. In addition to this, we deal with the spatial dependence existing in the data, applying a local filter to the data. The results indicate that spatial effects matter, and either absolute, conditional, or club convergence, if extended to the whole sample, might be restrictive assumptions. Excluding a small number of regions that behave as outliers, only a few regions show an appreciable rate of convergence. The majority of data show slow convergence, or no convergence at all. Furthermore, a dualistic phenomenon seems to be present inside some States, reinforcing the “diverging-convergence” paradox.

Keywords Regional growth · Convergence patterns · Mixture regression · Spatial effects

JEL Classification C21 · O40 · R11

A previous version of the paper was presented at the International Workshop on Spatial Econometrics and Statistics, May 25–27, 2006, Rome, Italy. We wish to thank all the participants for their useful comments. Our acknowledgements to Rolf Turner and two anonymous referees for technical advice.

M. Battisti · G. Di Vaio
Dipartimento di Scienze Economiche e Aziendali, LUISS Guido Carli, Viale Pola 12, 00198 Rome, Italy


1 Introduction

Assessing regional convergence across Europe, in terms of per capita income or product, is a relevant matter, not only to verify what growth theories predict, but also to evaluate the effectiveness of the Cohesion Policies. The expectations of New Entrants, indeed, require feasible answers from policy makers. After the recent enlargement, economic disparities increased dramatically. The Union’s ten richest regions each have a GDP equal to 189% of the EU-25 average, while for the ten poorest regions this indicator is equal to only 36%. In the New Member States, 90% of the total population live in regions with a per capita level of income below 75% of the Community average, which is the admissibility threshold to receive the Objective 1 Structural Funds (Commission of the European Communities 2005).

In the last few years, there has been much debate on European integration, regional growth and convergence, and on Cohesion Policies (for a review see Funck and Pizzati 2003). On the one hand, the EC stresses the gains of integration and the positive role of regional policies, which sustain the economic growth of the regions lagging behind (European Commission 2001, 2004). On the other hand, some scepticism has been voiced (Boldrin and Canova 2001) and, consequently, some questions have been raised. Will EU citizens see their welfare equalised to Community averages? Or will their standard of living fall, subjected to growing inequalities? Will cohesion policies have a positive impact on growth and convergence? Or will these policies be ineffective, serving mainly as instruments of redistribution?

In this work we examine the regional convergence process across Europe, focusing on some relevant issues linked to the empirical analysis. In particular, we show that sources of misspecification need to be carefully taken into account. The remainder of the paper is structured as follows: Sect. 2 reviews the literature on European regional convergence and poses the problem of heterogeneity and spatial effects. Section 3 highlights the existence of spatial dependence among EU regions and, at the same time, presents the data. Section 4 introduces the proposed methodology. Section 5 shows the results and, finally, Sect. 6 gives the conclusions.

2 The empirics of regional convergence

Convergence hypotheses across countries or regions have been subjected to several theoretical interpretations. Following the taxonomy originally proposed by Galor (1996), three different definitions can be identified: (a) unconditional or absolute convergence, meaning that per capita incomes converge to a common level in the long run, if structural homogeneities exist across the economies and their initial conditions do not matter;1 (b) conditional convergence, meaning that per capita incomes converge to different levels in the long run, if structural heterogeneities exist across the economies and their initial conditions do not matter; (c) club convergence, meaning that per capita incomes converge to different levels in the long run, if structural heterogeneities exist across the economies and their initial conditions do matter.

1 This hypothesis seems to be the one that the EC is interested in, as Quah (1996b, p. 1048, note 4) already pointed out.

Much of the empirical analysis aimed at testing the validity of these hypotheses has been based on the measure of β-convergence, derived from the neoclassical growth model of Solow–Cass–Koopmans (see Barro and Sala-i-Martin 2004). As is well known, the measure refers to the tendency for poor economies to grow faster than rich ones, i.e. to “catch up”, and is related to ranking dynamics in the sample distribution (Sala-i-Martin 1996a). Other measures are often used, which look at the reduction of distributional dispersion over time, such as sigma-convergence2 (Barro and Sala-i-Martin 1992), or the evolution of the entire distribution, such as the transition matrix approach3 (Quah 1993a, 1996a). These concepts, however, are less suited to the questions assessed in this work and to the methodology adopted, hence we prefer to focus on β-convergence. Empirical β-convergence models usually take the form of a cross-country/region growth regression

$g_i = a - b x_i + u_i, \quad i = 1, \dots, n, \qquad (1)$

where $g_i = [\ln(y_{i,T}) - \ln(y_{i,0})]/T$ is the average growth rate of economy i’s per capita income between time 0 and T, $a$ is a constant, $b = (1 - e^{-\beta T})/T$ is a convergence coefficient, $x_i = \ln(y_{i,0})$ is the log of economy i’s initial level of per capita income, and $u_i \sim N(0, \sigma^2)$ is an error term with the usual properties (see Durlauf et al. 2005). A positive value of the parameter β is supportive of convergence, and provides the rate at which the economy approaches the steady-state.4 Early empirical studies (Barro and Sala-i-Martin 1991; Sala-i-Martin 1996b) estimated Eq. (1) without control variables, based on two strong homogeneity assumptions. First, the constant term was considered inclusive of technological progress ($\gamma_i$), the steady-state value of effective per capita output ($\tilde{y}_i^*$), and initial efficiency ($A_{i,0}$), namely

(i) $a = a_i = \gamma_i + (1 - e^{-\beta_i T})/T \cdot \ln(A_{i,0} \cdot \tilde{y}_i^*), \ \forall i$.

Second, the convergence coefficient was considered constant across the economies, that is

(ii) $b = b_i = (1 - e^{-\beta_i T})/T, \ \forall i$.

Equation (1) estimated with assumptions (i) and (ii) can be seen as a test for type (a) convergence, a positive $\hat{\beta}$ implying that poor regions unconditionally grow faster than rich ones, at the same rate, towards a unique steady-state independently of the initial conditions.

2 As random shocks may produce criss-crossing or overshooting effects, β-convergence is a necessary but not sufficient condition for sigma-convergence (see Barro and Sala-i-Martin 2004).
3 Some researchers prefer this approach, since it provides a more complete set of information. β-convergence, in fact, suffers from the so-called “Galton’s fallacy”, hence it may be consistent with a stationary distribution over time (Quah 1993b; Hart 1995).
4 Estimates are usually obtained by calculating $\hat{b}$ by ordinary least squares (OLS), and re-parameterizing $\hat{\beta} = -\ln(1 - \hat{b}T)/T$. Estimation may also be done with nonlinear least squares (NLS). However, the use of one technique rather than another does not lead to appreciable statistical discrepancies (on this point, see Abreu et al. 2005a).

Barro and Sala-i-Martin (1991, p. 154), analysing a sample of 73 Western European regions over the period 1950–1985, found an “empirical regularity that the rate of β convergence is roughly 2% a year in a variety of circumstances [. . .] the half-life of this convergence process is 35 years”.5 In the last edition of their book, they concluded that “absolute β convergence is the norm for these regional economies” (Barro and Sala-i-Martin 2004, p. 496).

Tests of absolute β-convergence are plausible when the object of study is within-country convergence. In that case, regional economies share common steady-states, being affected by similar saving rates, preferences, governmental policies, property rights, infrastructures, and so on. In the case of between-country convergence, however, type (a) convergence appears quite unrealistic, since regions belonging to different countries may not show a common steady-state. As Solow (1999, p. 640) argued, “there is nothing in growth theory to require that the steady-state configuration be given once and for all [. . .] the steady-state will shift from time to time [and, we say, from space to space] whenever there are major technological revolutions, demographic changes, or variations in the willingness to save and invest”. If the determinants of the steady-state are not constant across regions, it follows that $a_i \neq a, \ \forall i$, leading assumption (i) to fail.6 Mankiw et al. (1992) solved the problem by relaxing assumption (i) in two ways. First, in the presence of heterogeneity, $\tilde{y}_i^*$ can be included among a set of control variables added to Eq. (1).
Second, if $A_{i,0}$ reflects not only the initial technology, but also resource endowment, climate, institutions, and other region-specific factors affecting growth, it may consist of a common and a random component, $\ln(A_{i,0}) = \ln(A_0) + e_i$, where $A_0$ is the common factor and $e_i$ is the specific effect. The error term is now equal to $u_i = (1 - e^{-\beta T})/T \cdot e_i + \varepsilon_i$, $\tilde{y}_i^*$ being independent of the error term. Assumption (i) is hence replaced by

(iii) $a = a_i = \gamma_i + (1 - e^{-\beta_i T})/T \cdot \ln(A_0), \ \forall i$.

Equation (1), with assumptions (ii) and (iii), and the inclusion of control variables, implies homogeneity in the convergence parameter, the initial efficiency, and the technological progress. Steady-state determinants, on the contrary, are allowed to be heterogeneous. Estimation can be considered a test for type (b) convergence, with a positive $\hat{\beta}$ meaning that poor regions grow conditionally faster than rich ones, at the same rate, towards different steady-states.

5 The so-called half-life condition is given by $e^{-\beta T} = 1/2 \Rightarrow T = \ln(2)/\beta$. If the speed of convergence is equal to 2% per year, it follows that $T \cong 0.69/0.02 \cong 35$, hence the economy fills half the gap in about 35 years.
6 Steady-state variables might be comprised in the error term, $u_i = (1 - e^{-\beta_i T})/T \cdot \ln(\tilde{y}_i^*) + \varepsilon_i$, where $\varepsilon_i$ is a random component. However, if those variables were related to initial income levels, and they had an impact on growth, $\tilde{y}_i^*$ would be an omitted variable and the coefficient $\hat{b}$ would be biased (Bernard and Durlauf 1996).

Armstrong (1995), testing for absolute and conditional convergence, either within or between countries, on a sample of 85 EU regions over the period 1950–1990, found significant discrepancies between the two hypotheses. The rate of between-country absolute convergence was about 1% per annum, much slower than the 2% found by previous studies. In fact, a rate of 2% was only found during the post-War period, for within-country conditional convergence.7 On the other hand, the years following the oil crisis saw a decrease of annual convergence rates, ranging from 0.8 to 1%.

Type (a) and (b) convergence tests, however, have been criticized in many respects. From a general cross-country perspective, problems such as parameter heterogeneity, outliers, and measurement errors have been highlighted (Temple 1998, 2000). Looking at the European regional experience, some researchers (Martin 1998; Petrakos et al. 2005) maintained that the convergence process does not follow a homogeneous pattern of growth.8 In this case, testing for types (a) or (b) convergence would be misleading, if the “true” convergence process saw the regions converging at different rates towards different income levels. The existence of structural heterogeneities may be compatible, for instance, with the presence of multiple regimes in cross-country growth behaviour identified by Durlauf and Johnson (1995), or with the convergence clubs (the “twin-peaks”) in world income distribution detected by Quah (1997). On the one hand, country-specific constraints on the adoption of technologies may affect the efficiency of regional economies and produce structural heterogeneities, as verified world-wide by Durlauf et al. (2001). In this case, assumption (iii) does not hold, because

(iv) $a \neq a_i = \gamma_i + (1 - e^{-\beta_i T})/T \cdot \ln(A_0), \ \forall i$.

On the other hand, growth models similar to the one developed by Azariadis and Drazen (1990) assume that spillovers due to physical or human capital accumulation cause threshold effects, which produce shifts in the aggregate production function, leading to multiple, locally stable, steady-state equilibria, i.e. to different convergence “clubs” (see Durlauf and Quah 1999). A threshold value in the income level, $\bar{y}$, implies $\beta_i = \beta_1$ if $y_{i,0} < \bar{y}$, and $\beta_i = \beta_2$ otherwise. Hence, assumption (ii) needs to be replaced by

(v) $b \neq b_i = (1 - e^{-\beta_i T})/T, \ \forall i$.
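The estimation steps discussed so far (OLS on Eq. (1), the re-parameterization of the slope into β, the half-life, and a threshold split of the sample) can be illustrated with a short sketch. It is not the procedure adopted in this paper: the threshold `y_bar` is imposed exogenously purely for illustration, the data are synthetic and noise-free, and the function names are hypothetical.

```python
import numpy as np

def estimate_beta(y0, yT, T):
    """OLS fit of g_i = a - b*x_i; returns (a_hat, b_hat, beta_hat)."""
    x = np.log(np.asarray(y0, dtype=float))
    g = (np.log(np.asarray(yT, dtype=float)) - x) / T
    slope, a_hat = np.polyfit(x, g, 1)   # fitted line: g = a_hat + slope * x
    b_hat = -slope                       # Eq. (1) writes the slope as -b
    # Re-parameterization; only defined when b_hat * T < 1.
    beta_hat = -np.log(1.0 - b_hat * T) / T
    return a_hat, b_hat, beta_hat

def half_life(beta):
    """Years needed to close half the gap to the steady state."""
    return np.log(2.0) / beta

def estimate_by_threshold(y0, yT, T, y_bar):
    """Separate fits below and above an assumed, exogenous threshold y_bar."""
    y0 = np.asarray(y0, dtype=float)
    yT = np.asarray(yT, dtype=float)
    low = y0 < y_bar
    return (estimate_beta(y0[low], yT[low], T),
            estimate_beta(y0[~low], yT[~low], T))

def simulate(y0, a, beta, T):
    """Noise-free final incomes implied by Eq. (1) for a given beta."""
    x = np.log(np.asarray(y0, dtype=float))
    b = (1.0 - np.exp(-beta * T)) / T
    return np.exp(x + (a - b * x) * T)

T = 20
y0_low, y0_high = np.array([5.0, 8.0, 10.0]), np.array([20.0, 30.0, 40.0])
y0 = np.concatenate([y0_low, y0_high])
yT = np.concatenate([simulate(y0_low, 0.10, 0.03, T),
                     simulate(y0_high, 0.08, 0.01, T)])
(low_fit, high_fit) = estimate_by_threshold(y0, yT, T, y_bar=15.0)
print(low_fit[2], high_fit[2], half_life(0.02))
```

With real data the fitted values would carry sampling error, and a single pooled regression over both groups would mix the two regimes into one slope.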
If initially poor regions converge towards a lower income level,9 then estimates of Eq. (1) with assumptions (iv) and (v) can be seen as tests of type (c) convergence. A positive $\hat{\beta}$ indicates that poor regions grow faster than rich ones, at different rates, towards different steady-states depending on their initial conditions. The existence of club convergence across Europe has been recognized by several authors. Early studies imposed exogenous assumptions on the number of clubs, to emphasize geographical and distributional factors, such as North–South, centre–periphery, or rich–poor divisions. Neven and Gouyette (1995) split a sample of 142 EU regions, over the period 1980–1989, into a Northern and a Southern club. They found a very low rate of 0.53% absolute convergence for the whole sample, and no statistically significant convergence inside either of the two clubs. Only when country-specific effects are controlled for does the rate of convergence take on significant values, between 1.1 and 1.8%.

7 Country-specific dummies are used to control for heterogeneity in steady-states. The common practice of employing dummy variables is due to lack of data at a regional level.
8 Convergence between States (towards the outside) but not within (towards the inside) has been defined as the “diverging-convergence” phenomenon.
9 Falling into a “poverty trap”.


M. Battisti, G. Di Vaio

One should, however, consider more fully endogenous criteria to detect the presence of clubs, rather than making exogenous choices which arbitrarily assign structural heterogeneities to different clubs. Canova (2004), for instance, adopted a predictive density approach to find convergence clubs in a sample of 144 EU regions, over the period 1980–1992. Avoiding a priori assumptions, he identified four homogeneous clubs, with different convergence rates and steady-state values highlighting North–South or poor–rich dimensions, the initial conditions influencing the probability of belonging to a club. Furthermore, many authors have argued that, due to geographical spillovers, the distribution of regional per capita income across Europe tends to be influenced by physical location (Quah 1996c; López-Bazo et al. 1999; Le Gallo and Ertur 2003). Ertur et al. (2006) treated spatial problems in the context of club convergence. In a sample of 138 EU regions, over the period 1980–1995, they found that per capita income levels were highly spatially correlated. In particular, an exploratory spatial data analysis (ESDA) revealed a division between Northern-rich regions and Southern-poor ones. Assuming the existence of heterogeneity across two different spatial regimes, and taking the spatial autocorrelation into account, they found no convergence in the Northern club, and an annual convergence rate of 2.9% in the Southern one. The procedure followed by Ertur et al. (2006) is based on an exogenous assumption that structural parameters are heterogeneous across regions, due to their geographical locations. To our knowledge, a procedure that combines an endogenous identification of convergence paths with spatial dynamics is not yet available in either the theoretical or the empirical literature. Working in this direction, we implement a strategy that leads to an endogenous selection of convergence regimes once spatial dependence effects have been taken into account.
It can be considered a first step for future research.

3 Spatial dependence across European regions

Spatial dependence, if not properly modelled, leads to serious misspecification problems in linear regressions (Anselin 1988, 2001; Anselin and Bera 1998). In the cross-sectional growth framework, in which the observations are spatially organized, the existence of geographical spillovers may violate the assumption that the error terms from neighbouring regions are independent (Rey and Montouri 1999). The common practice is to explicitly incorporate in the regression a spatial component, in the form of a spatial error or a spatial lag (Arbia 2006). Another approach, as we will see later, is to filter out the spatial dependence. A simple check of spatial dependence, in its weaker version of spatial autocorrelation, can be performed by means of Moran’s I statistic. As is well known, the statistic can be expressed as

$$ I = \frac{n}{q} \, \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} x_i x_j}{\sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j}, $$

where $w_{ij}$ is an element of a binary spatial weight matrix W, $x_i$ is a specific variable for observation i, n is the number of observations, and q is a scaling factor equal to the sum of all the elements of the matrix. In this paper we use a row-standardized binary matrix, based on the k-nearest neighbouring regions, the elements of which are

$$ w_{ij}(k) = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } d_{ij} \le d_i(k) \\ 0 & \text{if } d_{ij} > d_i(k) \end{cases} $$

where $d_i(k)$ is a critical cut-off distance, defined for each observation i, ensuring that every region of the sample has the same number (k) of neighbours. Abreu et al. (2005b) have shown that contiguity-based matrices are the most popular choice in the literature. In the case of the European regions, however, this kind of matrix leaves the islands unconnected to the continent, hence distance-based specifications have been preferred in applied work (Le Gallo and Ertur 2003; Le Gallo and Dall’erba 2006). We chose a specification with k = 20, since it cancels out the spatial autocorrelation in the filtered series.10 Such a specification is also able to link Cyprus with the Greek regions, which in turn are connected to Italy; Ireland with the UK, which is connected to continental Europe; Sicily and Sardinia with continental Italy; Corsica with the continental French regions.11 Data on per capita GDP, expressed in Euros at 1995 prices, are taken from Cambridge Econometrics, European Regional Database, 2004. We work with two samples. The larger sample includes 242 NUTS-2 regions from EU-25,12 covering the period 1991–2002, while the smaller one comprises 190 NUTS-2 regions from EU-15, covering the period 1980–2002. Figure 1 shows standardized scatter-plots for the smaller sample based on Moran’s I of (a) the log of per capita GDP, and (b) the average growth rate of per capita GDP.13 A highly positive spatial correlation of per capita income levels among the European regions is clearly evident. The majority of the observations fall into the high–high (HH) or the low–low (LL) quadrant. In fact, rich (poor) regions are surrounded by rich (poor) regions. Spatial correlation of growth rates is also positive, albeit in a weaker form. Over the whole period, the spatial dynamic of the growth rates is unable to offset the spatial concentration of the economic activity.
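The weight matrix and the statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `knn_weights` and `morans_i`, the planar centroid coordinates and the toy data are hypothetical, and the statistic is computed in its conventional form, on deviations from the mean with the sum of squared deviations in the denominator. It is not the GeoDa routine used for the paper.

```python
import math

def knn_weights(coords, k):
    """Row-standardised k-nearest-neighbour weights.

    coords: list of (x, y) centroids (hypothetical planar coordinates).
    Returns an n x n list of lists with zero diagonal and rows summing to 1.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Distances from region i to every other region, nearest first
        # (ties are broken by the region index).
        others = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        for _, j in others[:k]:
            w[i][j] = 1.0 / k  # binary weight divided by the row sum k
    return w

def morans_i(x, w):
    """Moran's I of the values x for the weight matrix w (assumed form)."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]                    # deviations from the mean
    q = sum(sum(row) for row in w)               # sum of all the weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / q) * num / den

# Toy example: four regions on a line, one neighbour each.
toy_w = knn_weights([(0, 0), (1, 0), (2, 0), (3, 0)], k=1)
toy_i = morans_i([1.0, 2.0, 3.0, 4.0], toy_w)
```

With k = 20 and the regional centroids in place of the toy coordinates, the same two functions would give a statistic comparable in spirit to the values discussed in the text, although the exact figures depend on the distance metric and the tie-breaking rule, neither of which is specified here.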
At the end of the period, the physical location of income levels is as agglomerated as it was at the beginning. Moran’s I of per capita GDP in 2002 equals 0.6, only one percentage point less than its value in 1980.

Fig. 1 Moran scatterplots (standardized) of (a) the log of per capita GDP, 1980 (Moran’s I: 0.7), and (b) the average growth rate of per capita GDP, 1980–2002 (Moran’s I: 0.2). Horizontal axes: Log of Per Capita GDP 1980; Growth Rate 1980–2002. Vertical axes: the corresponding spatial lags (W Log of Per Capita GDP 1980; W Growth Rate 1980–2002). Quadrants: HH, LH, LL, HL. 190 NUTS-2 regions from EU-15. Data source: Cambridge Econometrics, European Regional Database, 2004

Table 1 Absolute β-convergence

                      EU-15           EU-15           EU-15           EU-25
                      1980–2002       1980–1991       1991–2002       1991–2002
â                     0.076 (0.011)   0.040 (0.016)   0.102 (0.015)   0.093 (0.012)
b̂                     0.006 (0.001)   0.002 (0.002)   0.009 (0.002)   0.008 (0.001)
Half-life             108 years       310 years       69 years        79 years
R2                    0.12            0.01            0.15            0.17
Log-likelihood        686             623             642             700
Obs                   190             190             190             242
Diagnostics for spatial autocorrelation
Moran’s I err-u*      0.12 (0.00)     0.16 (0.00)     0.17 (0.00)     0.07 (0.00)
Moran’s I err-f*      −0.03 (0.24)    −0.01 (0.87)    −0.02 (0.42)    −0.02 (0.44)

OLS estimates and diagnostics for spatial autocorrelation. Standard errors within parentheses. * P values within parentheses. Data source: Cambridge Econometrics, European Regional Database, 2004

10 However, other weight matrices, with k = 10, 15, produced very similar results in the mixture. See the next section for the filtering procedure.
11 The weight matrix was obtained using the GeoDa software package (Anselin 2005).
12 Inclusive of German Ex-Länder. The list of regions is available upon request.
13 To save space, we do not show here the figures for the larger sample. The results, however, are very similar.

From the above statistics, we would expect spatial dependence to matter for the study of β-convergence in Europe. As a preliminary analysis we estimate Eq. (1) for both samples by standard OLS, and test for spatial autocorrelation among the regression residuals. We split the EU-15 sample into two sub-periods of equal length, 1980–1991 and 1991–2002. The breakdown is useful for at least two reasons. First, in the Nineties major institutional changes, such as the implementation of Cohesion policies and the establishment of the criteria for adhesion to EMU, may have had an impact on the convergence process of the EU regions. Second, such a breakdown makes it possible to compare the smaller sample with the larger one, to see if the regions which first joined the Union experienced different convergence patterns. The results are shown in Table 1. Over the whole period, the convergence coefficient for the EU-15 sample is highly significant. Its magnitude, however, is only about one third of the “empirical norm” of 2%. Indeed, the convergence rate equals about 0.6% per annum, leading to a half-life of 108 years.14 The poorest regions are thus expected to close half the gap with the richest ones in over a century. Looking at the two sub-periods, it can be clearly seen that the bulk of convergence occurs in the 1990s. During the 1980s, the convergence coefficient is very low and not statistically significant, pointing to a lack of convergence. Enlarging the sample to the EU-25 regions does not change the picture very much, since convergence decreases only slightly when compared to the EU-15 sample.

Finally, the bottom part of Table 1 shows the spatial autocorrelation diagnostics. The test is based on Moran’s I statistic applied to regression residuals. Under some regularity conditions, the distribution of the test corresponds to the standard normal (Anselin 1988). The results highlight a significant spatial autocorrelation among the residuals,15 for both samples, both over the whole period and for the two sub-periods. Interestingly, during the 1990s, the spatial dependence is higher for the EU-15 regions, meaning that per capita income is less concentrated in the enlarged Europe. This preliminary spatial analysis shows that spatial phenomena can be relevant for the study of EU regional convergence, and that they should be taken into account in the cross-sectional framework, in order to avoid misspecification problems. Absolute β-convergence, tested by standard OLS with no spatial specification, suffers from many shortcomings which undermine its ability to explain the regional growth processes.
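The link between the estimated coefficient and the half-life can be illustrated with a short Python sketch. It assumes that the convergence rate is recovered by inverting relation (v), b = (1 − e^{−βT})/T, and that the half-life is defined in the usual way as ln 2/β; the function names `implied_beta` and `half_life` are hypothetical, and the inputs are the rounded values reported in Table 1, so small differences from the published figures are to be expected.

```python
import math

def implied_beta(b, T):
    """Invert b = (1 - exp(-beta * T)) / T for the annual rate beta."""
    if b * T >= 1.0:
        raise ValueError("b * T must be below 1 for a finite beta")
    return -math.log(1.0 - b * T) / T

def half_life(beta):
    """Years needed to close half of the gap (assumed definition ln 2 / beta)."""
    if beta <= 0.0:
        return math.inf  # no convergence: the gap is never halved
    return math.log(2.0) / beta

# Rounded EU-15 estimate for 1980-2002: b = 0.006 over T = 22 years.
beta_eu15 = implied_beta(0.006, 22)
years_eu15 = half_life(beta_eu15)
```

Under these assumptions the rounded estimate gives a rate of roughly 0.6% per year and a half-life close to the 108 years reported in the table. A non-positive rate is mapped to an infinite half-life, which corresponds to the cases where no half-life can be reported.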

14 The result may not be very interesting in terms of policy implications. On how to calculate $\hat{\beta}$ see note 4 in Sect. 2.
15 The test was carried out on the residuals of the unfiltered series (err-u), as well as on the filtered series (err-f) obtained with the filtering procedure described in the next section. Once spatial correlation is filtered out from the series, the test becomes insignificant.

4 The spatially filtered mixture of regressions

Given the spatial influence highlighted in the previous section, our interest here lies in an endogenous determination of heterogeneity in regional convergence patterns, once the spatial dependence in the data has been properly treated. We avoid a priori restrictions, such as geographical (North–South or centre–periphery), or exploratory (based upon spatial association indices) divisions. To this end, we use a spatially filtered mixture regression approach (for mixture densities, see Titterington et al. 1985; Wedel and Kamakura 1998; MacLachlan and Peel 2000). Previous attempts to apply mixture densities, or mixture regressions, to convergence analysis are found in the works of Paap and van Dijk (1998), Tsionas (2000), and Bloom et al. (2003). Those studies, however, do not deal with spatially related questions, and differ from ours in several other respects. Let us begin by considering spatial dependence. As we have seen in the previous section, if spatial dependence is present across a sample, OLS estimates are biased or inefficient. In the case of spatial effects influencing the errors in Eq. (1), statistical inference based on OLS is not reliable, because the assumption that the errors of neighbouring regions are independent may be violated. We treat potential sources of misspecification in the β-convergence framework by isolating the spatial correlation by means of a local filter.16 Two filtering procedures have been shown to give the same empirical results in “cleaning up” the spatial effects from geographically organized variables (Getis and Griffith 2002). We use the Getis local filter $G_i(d)$ (Getis 1995), constructed, for each spatial unit i, as

$$ G_i(d) = \frac{\sum_{j=1}^{n} w_{ij}(d)\, x_j}{\sum_{j=1}^{n} x_j}, \quad i \neq j, $$

where x is the original unfiltered variable and $w_{ij}(d)$ is the element of the spatial weight matrix related to the j neighbouring regions within the distance threshold d. The filtered variable $x^F$ (where F stands for filtered) can then be obtained as

$$ x_i^F = \frac{x_i \left[ \sum_{j=1}^{n} w_{ij}(d) / (n - 1) \right]}{G_i(d)}, $$

while the residual spatial component can be defined as $x^S = x - x^F$ (S stands for the spatial component). Applying this filter to the series makes the OLS estimates consistent. We test different specifications of W (k = 10, 15, 20) until Moran’s I on the regression residuals becomes insignificant. The specification with k = 20 gives the desired outcome.17 A similar two-step procedure, with a spatial filtering in the first step and a panel regression in the second, is implemented, for example, by Badinger et al. (2004). Our approach uses the same first step, where the spatial dependence is “cleaned up”, while the second step is based on the application of the mixture regression model to the filtered variables. At the end of the procedure we obtain a transformed version of Eq. (1)

$$ g_i^F = a - b x_i^F + u_i \qquad (2) $$

the least squares estimation of which is consistent.18

16 In a previous version of this work we used a global filter obtained by means of a spatial parameter estimated in a spatial error model (see Anselin and Bera 1998). Results on convergence rates and pattern identification were quite similar, but that procedure is less established in the literature and it causes inference problems in the second step, arising from the estimation of mixtures of equations with generated regressors (see Pagan 1984).
17 See the row labelled Moran’s I err-f in Table 1.
18 We estimate Eq. (1) without control variables, because we admit heterogeneity in steady-states, allowing for different intercepts across clubs.
19 We refer to patterns, groups, or regimes without distinction.

In the second step we employ the mixture regression model to detect the existence of regional convergence patterns. Assume that the “true” density function of a population is a mixture of several functions, one for each pattern with different parameters, weighted by the probability of belonging to a specific pattern. If the population is divided into k groups,19 the number of groups being unknown and the sum of the probabilities of belonging to one of these being equal to one for each observation, then, according to the total probability theorem, the conditional distribution function is

$$ f(g_i^F | \theta) = \sum_{s=1}^{k} \psi_s f_s(g_i^F | \theta_s) \qquad (3) $$

where $g_i^F$ is the dependent variable, $\psi_s$ is the (a priori) probability of belonging to regime s, with s = 1, ..., k, and θ is a vector of parameters. Once $\psi_s$ has been estimated, the posterior probability that observation i comes from s has to be computed by Bayes’ theorem. Assume that the filtered variable $g_i^F$ is normally distributed. The density function conditional on belonging to regime k is

$$ f(g_i^F | s = k, \theta) = (2\pi\sigma_k^2)^{-1/2} \exp\left\{ -\frac{\left[ g_i^F - (a_k - b_k x_i^F) \right]^2}{2\sigma_k^2} \right\}. \qquad (4) $$

In this way, the component $(a_k - b_k x_i^F)$ gives us a linear predictor that replaces the population mean of the group. From Bayes’ rule, it is straightforward to express the probability of $g_i^F$ for s = k as a joint probability, that is, the product of the conditional probability and the marginal probability of belonging to a club. The latter is equal to $\psi_k$, hence the joint probability is $\psi_k f(g_i^F | s = k, \theta)$. Summing over all values of s gives the unconditional density of $g_i^F$

$$ f(g_i^F, \theta) = \sum_{s=1}^{k} \psi_s (2\pi\sigma_s^2)^{-1/2} \exp\left\{ -\frac{\left[ g_i^F - (a_s - b_s x_i^F) \right]^2}{2\sigma_s^2} \right\}. \qquad (5) $$
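For reference, the posterior membership probability mentioned above follows from Bayes’ theorem by dividing the joint probability for regime s by the unconditional density in Eq. (5). The notation $\hat{p}_{is}$ is introduced here only for illustration and is not the authors’ own:

```latex
\hat{p}_{is}
  = \frac{\hat{\psi}_s \, f_s(g_i^F \mid \hat{\theta}_s)}
         {\sum_{r=1}^{k} \hat{\psi}_r \, f_r(g_i^F \mid \hat{\theta}_r)},
\qquad
\sum_{s=1}^{k} \hat{p}_{is} = 1 \quad \text{for each } i .
```

The numerator is the joint probability $\psi_s f(g_i^F | s, \theta)$ evaluated at the estimates, and the denominator is the mixture density of observation i, so the probabilities sum to one across regimes for every region.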

The vector of parameters θ, which also contains the weights $\psi_s$, is unknown. A simple way to solve this type of missing data problem is through the Expectation–Maximization (EM) algorithm. The procedure is to choose initial values for the parameters, compute the density for these parameters, and then re-compute θ by maximization of the log-likelihood. The algorithm therefore has two alternating steps: in the first (expectation), it computes the density function for the chosen parameters, while in the second (maximization), it derives the estimates of the parameters $a_s$, $b_s$, and $\sigma_s^2$. In the case of a linear mixture regression, De Sarbo and Cron (1988) show that the second step is equivalent to performing k weighted least squares regressions, where the weights are the square roots of the probabilities of belonging to a club. We begin with random starting probabilities, then we update the probabilities step by step. This strategy could have two types of shortcoming. First, the results might depend on the initial probabilities. Second, the maximization of the log-likelihood could converge to a local optimum. To avoid these problems, and to be reasonably confident that our estimates do not correspond to a local maximum, we tried 500 different starting values. We chose the solution with the highest log-likelihood, which also helps to determine the number of components in the mixture, as we will see below.
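The alternating steps just described can be sketched in plain Python for the one-regressor case of Eq. (2). This is a minimal illustration under stated assumptions, not the authors’ code: the function names, the variance floor, the tolerance and the default of 20 random starts (the text reports 500) are all hypothetical choices, and the weighted regression uses the membership probabilities as weights, which is equivalent to ordinary least squares on data scaled by their square roots.

```python
import math
import random

def normal_pdf(g, mean, var):
    """Normal density with the given mean and variance, as in Eq. (4)."""
    return math.exp(-(g - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def weighted_fit(x, g, p):
    """Weighted least squares of g on x with weights p.
    Returns (a, b, var) for the line g = a - b * x."""
    sw = sum(p)
    mx = sum(pi * xi for pi, xi in zip(p, x)) / sw
    mg = sum(pi * gi for pi, gi in zip(p, g)) / sw
    sxx = sum(pi * (xi - mx) ** 2 for pi, xi in zip(p, x))
    sxg = sum(pi * (xi - mx) * (gi - mg) for pi, xi, gi in zip(p, x, g))
    slope = sxg / sxx if sxx > 0.0 else 0.0
    a = mg - slope * mx
    var = sum(pi * (gi - a - slope * xi) ** 2 for pi, xi, gi in zip(p, x, g)) / sw
    return a, -slope, max(var, 1e-10)  # variance floor avoids degenerate regimes

def em_mixture(x, g, k, n_starts=20, max_iter=500, tol=1e-8, seed=0):
    """Best of n_starts EM runs; returns (loglik, psi, params, posteriors)."""
    rng = random.Random(seed)
    n = len(x)
    best = None
    for _ in range(n_starts):
        # Random starting membership probabilities, normalised per observation.
        post = []
        for _ in range(n):
            draws = [rng.random() + 1e-12 for _ in range(k)]
            post.append([d / sum(draws) for d in draws])
        loglik = -math.inf
        for _ in range(max_iter):
            # M-step: one weighted regression per regime and the mixing weights.
            params = [weighted_fit(x, g, [row[s] for row in post]) for s in range(k)]
            psi = [sum(row[s] for row in post) / n for s in range(k)]
            # E-step: posterior probabilities via Bayes' rule, and the log-likelihood.
            new_loglik = 0.0
            for i in range(n):
                dens = []
                for s in range(k):
                    a, b, var = params[s]
                    dens.append(max(psi[s] * normal_pdf(g[i], a - b * x[i], var), 1e-300))
                total = sum(dens)
                post[i] = [d / total for d in dens]
                new_loglik += math.log(total)
            converged = abs(new_loglik - loglik) < tol
            loglik = new_loglik
            if converged:
                break
        if best is None or loglik > best[0]:
            best = (loglik, psi, params, [row[:] for row in post])
    return best

def assign(posteriors):
    """Attribute each observation to its most probable regime."""
    return [max(range(len(row)), key=row.__getitem__) for row in posteriors]
```

With k = 1 the routine reduces to ordinary least squares, which gives a simple sanity check. Keeping the run with the highest log-likelihood across random starts mirrors the safeguard against local optima described in the text, although it cannot guarantee that the global maximum is found.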


Finally, each region is attributed to a regime if the probability of belonging to that regime is higher than the probability of belonging to the others. A last point concerns inference. Since standard errors are not used by the EM algorithm to iterate (Wedel and Kamakura 1998), when the algorithm converges to the final value, an estimate of the covariance matrix is obtained from the Fisher information matrix. In the case of the EM algorithm, Louis (1982) obtains this observed information matrix as the difference between two matrices: the total information matrix and the missing data information matrix. Turner (2000) shows computational details for mixture regression models.20

5 Results

This section shows the results obtained with the two-step methodology described above. The choice of the number of mixture components is based upon two complementary rules. On the one hand, we look at the improvements in the log-likelihood given by additional components, through a set of information criteria. On the other hand, we consider a meaningful interpretation of the data. The latter rule regards either the existence of appreciable differences in the significance of the parameters, or the certainty of the regions’ attribution to specific regimes.21 The logic of the former rule is to penalize increases in the number of components in the mixture, in order to avoid an excessive (and useless) number of parameters. The AIC (Akaike Information Criterion) is the least restrictive criterion, so it generally selects less parsimonious specifications. By contrast, the BIC (Bayesian Information Criterion) is the most restrictive one. The MAIC (Modified Akaike Information Criterion) falls in between (for a detailed description, see Hawkins et al. 2001). According to the criteria reported in Table 2, a specification with only one component is not supported by the data, while a three-component specification is the best approximation for all the samples, except for the EU-15, 1991–2002, for which all the criteria indicate a choice of two regimes.22

Table 2 Decision criteria for the number of mixture components

                       Log-likelihood   AIC      MAIC     BIC
EU-15: 1980–2002
  1 Component (OLS)    693              −1,380   −1,377   −1,370
  2 Components         730              −1,446   −1,439   −1,423
  3 Components         736              −1,450   −1,439   −1,415
EU-15: 1980–1991
  1 Component (OLS)    629              −1,252   −1,249   −1,243
  2 Components         660              −1,306   −1,299   −1,283
  3 Components         669              −1,316   −1,305   −1,281
EU-15: 1991–2002
  1 Component (OLS)    637              −1,268   −1,265   −1,258
  2 Components         653              −1,293   −1,286   −1,270
  3 Components         657              −1,292   −1,281   −1,256
EU-25: 1991–2002
  1 Component (OLS)    732              −1,458   −1,455   −1,447
  2 Components         750              −1,485   −1,478   −1,461
  3 Components         754              −1,487   −1,476   −1,448

Data source: Cambridge Econometrics, European Regional Database, 2004

Table 3 reports the results obtained by the mixture regression, applied to the spatially filtered variables. Generally, the two-component specification selects a small group with very fast convergence rates, ranging from about 1.3 to 3.2%, which becomes a divergence regime in the Nineties. Such a group, comprising from 12 to 26% of the data, seems to behave like an “outlier bin”, since it collects regions with particular growth experiences. The majority of the regions fall into a regime characterized by a very slow convergence rate, equal to about 0.5% per year. This regime does not show substantial differences across samples and periods. The half-lives, in all cases, exceed a century, and are thus not particularly interesting in terms of cohesion. The three-component specification makes it clear that the slow convergence regime is made up of a smaller group with fast convergence rates, ranging from about 1.2 to 4.7%, and a larger regime with slower or absent convergence.23 The latter comprises one half to two thirds of the regions in the samples, depending on the periods considered. The convergence regime shrinks during the Nineties, a decade that saw the fostering of the EU integration process due to the Maastricht Treaty and adhesion to EMU. Overall results, interestingly, seem to be in line with recent empirical investigations on the subject (see Meliciani and Peracchi 2006).

As a final step, we proceed with a visual inspection of the regions, allocated by the mixture to the different regimes, net of spatial effects (Fig. 2). As an illustration, the EU-25 sample is shown. Regions in dark grey make up the faster convergence group, the light grey group is the slower convergence regime, while white indicates the regions which do not converge. The map depicts the existence of dualistic phenomena in many States. Rich and poor regions in Ireland, the UK, France, Germany, Spain, Italy, as well as in many of the New Member States, fall into opposite regimes that are not converging between themselves. Such phenomena reinforce the “diverging-convergence” paradox (i.e. convergence between States, but not within).

20 The observed information matrix can be computed as the difference between the total information matrix and the missing data information matrix. This matrix is then inverted, and the square roots of the elements of its main diagonal give the standard errors of the parameters. The matrix dimension is given by the number of parameters minus one, because one of the weights is a linear combination of the others.
21 In some cases the attribution may be less precise (i.e. there are regions with a 100% probability of belonging to a group, and regions with only a 51% probability). Generally, we find that about 90% of regions are attributed with a large margin with respect to the alternative regime.
22 To save space, we do not show the tables with four and five components. The criteria, however, do not record any improvement with respect to the three-component specification.
23 In the case of EU-15, 1980–1991, the convergence rate remains similar to the two-component specification, being equal to about 0.6% per year.
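The criteria reported in Table 2 can be approximated from the log-likelihoods alone. The Python sketch below assumes the standard definitions AIC = −2 ln L + 2p and BIC = −2 ln L + p ln n, and takes MAIC = −2 ln L + 3p, a penalty that is inferred here from the published figures rather than stated in the text. The parameter count p = 4k − 1 (k intercepts, k slopes, k variances and k − 1 free weights) and the function names are likewise assumptions.

```python
import math

def n_params(k):
    """Free parameters of a k-regime mixture of one-regressor regressions:
    k intercepts, k slopes, k variances and k - 1 mixing weights."""
    return 4 * k - 1

def information_criteria(loglik, k, n_obs):
    """AIC, MAIC and BIC under the assumed penalties (smaller is better)."""
    p = n_params(k)
    return {
        "AIC": -2.0 * loglik + 2.0 * p,
        "MAIC": -2.0 * loglik + 3.0 * p,
        "BIC": -2.0 * loglik + p * math.log(n_obs),
    }

# Rounded log-likelihoods for EU-15, 1980-2002 (190 regions), from Table 2.
eu15_criteria = {
    k: information_criteria(ll, k, 190)
    for k, ll in [(1, 693.0), (2, 730.0), (3, 736.0)]
}
```

Because the published log-likelihoods are rounded to integers, the values obtained this way can differ from the table by about one unit. The heavier penalty per parameter under BIC is what makes it the most restrictive of the three criteria when the sample has more than about twenty observations.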

Table 3 Spatially filtered mixture regressions

                       2 Components                     3 Components
                       Regime 1        Regime 2         Regime 1        Regime 2        Regime 3
EU-15: 1980–2002
  â                    0.324 (0.120)   0.062 (0.015)    0.340 (0.120)   0.183 (0.023)   0.037 (0.020)
  b̂                    0.032 (0.013)   0.005 (0.002)    0.034 (0.013)   0.018 (0.002)   0.002 (0.002)
  Half-life (years)    5               131              11              30              339
  Weight               12%             88%              12%             22%             66%
EU-15: 1980–1991
  â                    0.267 (0.108)   0.067 (0.021)    0.454 (0.048)   −0.072 (0.127)  0.075 (0.023)
  b̂                    0.027 (0.012)   0.005 (0.002)    0.047 (0.005)   −0.010 (0.014)  0.006 (0.003)
  Half-life (years)    22              135              10              –               112
  Weight               26%             74%              15%             19%             66%
EU-15: 1991–2002
  â                    −0.061 (0.145)  0.064 (0.025)    −0.161 (0.179)  0.255 (0.111)   0.036 (0.030)
  b̂                    −0.008 (0.015)  0.005 (0.003)    −0.019 (0.02)   0.025 (0.012)   0.002 (0.003)
  Half-life (years)    –               135              –               24              343
  Weight               20%             80%              14%             24%             62%
EU-25: 1991–2002
  â                    0.144 (0.063)   0.069 (0.016)    0.222 (0.085)   0.137 (0.032)   0.012 (0.024)
  b̂                    0.013 (0.007)   0.006 (0.002)    0.021 (0.009)   0.012 (0.003)   −0.000 (0.002)
  Half-life (years)    49              112              29              54              –
  Weight               23%             77%              16%             38%             46%

Standard errors within parentheses. Data source: Cambridge Econometrics, European Regional Database, 2004

6 Conclusions

In this work we analysed regional convergence patterns, trying to avoid potential sources of problems due to spatial effects, parameter heterogeneity and outliers. The methodology adopted here shows that neither absolute, conditional, nor club convergence is the best hypothesis for explaining regional growth in Europe over the period 1980–2002. In summary, a common specification for the whole sample is too restrictive an assumption, convergence rates are far removed from the “empirical norm” of 2% per year, and spatial effects matter. Once the data have been spatially filtered, the mixture endogenously identifies multiple, a-spatial, growth regimes. In the case of a three-component specification, generally one regime behaves as an “outlier bin”, another shows a sustained convergence rate, and the third, comprising the majority of the sample, shows no convergence at all. Many regions, whether inside “poor” or “rich” States, fall into the non-convergence regime, where agglomeration factors and increasing returns might play a role. Such a mechanism reinforces the paradox of the so-called “diverging-convergence”, that is, convergence between States but not within. Furthermore, a North–South division does not emerge, except in Italy’s case, while a core–periphery dynamic seems a more plausible scenario.

Finally, since the main intent of this work was to take into account misspecification sources in the β-convergence framework, policy prescriptions cannot be easily drawn. However, some implications may be discussed. Since convergence rates do not vary very much between the two sub-periods, cohesion policies do not appear to have made a substantial difference. If anything, the 1990s see an expansion of the non-convergence area. Looking at the enlarged sample, regions belonging to the New Member States show different trends. In conclusion, over the period considered, regional growth dynamics do not seem to have followed a common pattern towards the convergence of per capita income across Europe.

Fig. 2 Convergence patterns, net of spatial influence, EU-25: 1991–2002. Dark grey: fast convergence; light grey: slow convergence; white: no convergence

References

Abreu M, De Groot HLF, Florax RJGM (2005a) A meta-analysis of β-convergence: the legendary 2%. J Econ Surv 19:389–420
Abreu M, De Groot HLF, Florax RJGM (2005b) Space and growth: a survey of empirical evidence and methods. Région et Développement 21:13–44
Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht
Anselin L (2001) Spatial econometrics. In: Baltagi BH (ed) A companion to theoretical econometrics. Blackwell, Oxford, pp 310–330
Anselin L (2005) Exploring spatial data with GeoDa: a workbook. Center for Spatially Integrated Social Science, Santa Barbara
Anselin L, Bera AK (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah A, Giles DEA (eds) Handbook of applied economic statistics. Marcel Dekker, New York, pp 237–289
Arbia G (2006) Spatial econometrics: statistical foundations and applications to regional convergence. Springer, Berlin
Armstrong HW (1995) Convergence among regions of the European Union, 1950–1990. Pap Reg Sci 74:143–152
Azariadis C, Drazen A (1990) Threshold externalities in economic development. Q J Econ 105:501–526
Badinger H, Müller WG, Tondl G (2004) Regional convergence in the European Union, 1985–1999: a spatial dynamic panel analysis. Reg Stud 38:241–253
Barro RJ, Sala-i-Martin X (1991) Convergence across states and regions. Brookings Pap Econ Act 1:107–182
Barro RJ, Sala-i-Martin X (1992) Convergence. J Polit Econ 100:223–251
Barro RJ, Sala-i-Martin X (2004) Economic growth, 2nd edn. MIT, Cambridge
Bernard AB, Durlauf SN (1996) Interpreting tests of the convergence hypothesis. J Econom 71:161–173
Bloom DE, Canning D, Sevilla J (2003) Geography and poverty traps. J Econ Growth 8:355–378
Boldrin M, Canova F (2001) Inequality and convergence in Europe’s regions: reconsidering European regional policies. Econ Policy 16:207–253
Canova F (2004) Testing for convergence clubs in income per capita: a predictive density approach. Int Econ Rev 45:49–77
Commission of the European Communities (2005) Communication from the Commission. Third progress report on cohesion: towards a new partnership for growth, jobs and cohesion. 17.5.2005 COM(2005) 192 final. Brussels
De Sarbo WS, Cron WL (1988) A maximum likelihood methodology for clusterwise linear regression. J Classif 5:249–282
Durlauf SN, Johnson PA (1995) Multiple regimes and cross-country growth behaviour. J Appl Econom 10:365–384
Durlauf SN, Quah DT (1999) The new empirics of economic growth. In: Taylor JB, Woodford M (eds) Handbook of macroeconomics, vol 1. North Holland, Amsterdam, pp 235–308
Durlauf SN, Kourtellos A, Minkin A (2001) The local Solow growth model. Eur Econ Rev 45:928–940
Durlauf SN, Johnson PA, Temple JRW (2005) Growth econometrics. In: Durlauf SN, Aghion P (eds) Handbook of economic growth. North Holland, Amsterdam, pp 555–677
Ertur C, Le Gallo J, Baumont C (2006) The European regional convergence process, 1980–1995: do spatial regimes and spatial dependence matter? Int Reg Sci Rev 29:3–24
European Commission (2001) Unity, solidarity, diversity for Europe, its people and its territory. Second report on economic and social cohesion. Office for Official Publications of the European Communities, Luxembourg
European Commission (2004) A new partnership for cohesion. Convergence competitiveness cooperation. Third report on economic and social cohesion. Office for Official Publications of the European Communities, Luxembourg
Funck B, Pizzati L (eds) (2003) European integration, regional policy, and growth. World Bank, Washington D.C.
Galor O (1996) Convergence? Inferences from theoretical models. Econ J 106:1056–1069
Getis A (1995) Spatial filtering in a regression framework: examples using data on urban crime, regional inequality, and government expenditures. In: Anselin L, Florax R (eds) New directions in spatial econometrics. Springer, Berlin, pp 172–88
Getis A, Griffith DA (2002) Comparative spatial filtering in regression analysis. Geogr Anal 34:130–140
Hart PE (1995) Galtonian regression across countries and the convergence of productivity. Oxf Bull Econ Stat 57:287–293
Hawkins DS, Allen DM, Stromberg AJ (2001) Determining the number of components in mixture of linear models. Comput Stat Data Anal 38:15–48
Le Gallo J, Ertur C (2003) Exploratory spatial data analysis of the distribution of regional per capita GDP in Europe, 1980–1995. Pap Reg Sci 82:175–201
Le Gallo J, Dall’erba S (2006) Evaluating the temporal and spatial heterogeneity of the European convergence process, 1980–1999. J Reg Sci 46:269–288
López-Bazo E, Vayá E, Mora AJ, Suriñach J (1999) Regional economic dynamics and convergence in the European union. Ann Reg Sci 33:343–370


Louis TA (1982) Finding the observed information matrix when using the EM algorithm. J R Stat Soc B 44:226–233 MacLachlan G, Peel D (2000) Finite mixture models. Wiley, New York Mankiw NG, Romer D, Weil DN (1992) A contribution to the empirics of economic growth. Q J Econ 107:407–437 Martin P (1998) Can regional policies affect growth and geography in Europe? World Econ 21:757–774 Meliciani V, Peracchi F (2006) Convergence in per-capita GDP across European regions: a reappraisal. Empir Econ 31:549–568 Neven D, Gouyette C (1995) Regional convergence in the European community. J Common Market Stud 33:47–65 Paap R, van Dijk HK (1998) Distribution and mobility of wealth of nations. Eur Econ Rev 42:1269–1293 Pagan A (1984) Econometric issues in the analysis of regressions with generated regressors. Int Econ Rev 25:221–247 Petrakos G, Rodríguez-Pose A, Rovolis A (2005) Growth, integration, and regional disparities in the European union. Environ Plann A 37:1837–1855 Quah DT (1993a) Empirical cross-section dynamics in economic growth. Eur Econ Rev 37:426–434 Quah DT (1993b) Galton’s fallacy and tests of the convergence hypothesis. Scand J Econ 95:427–443 Quah DT (1996a) Empirics for economic growth and convergence. Eur Econ Rev 40:1353–1375 Quah DT (1996b) Twin peaks: growth and convergence in models of distribution dynamics. Econ J 106:1045–1055 Quah DT (1996c) Regional convergence clusters across Europe. Eur Econ Rev 40:951–958 Quah DT (1997) Empirics for growth and distribution: stratification, polarization, and convergence clubs. J Econ Growth 2:27–59 Rey SJ, Montouri BD (1999) US regional income convergence: a spatial econometric perspective. Reg Stud 33:143–156 Sala-i-Martin X (1996a) The classical approach to convergence analysis. Econ J 106:1019–1036 Sala-i-Martin X (1996b) Regional cohesion: evidence and theories of regional growth and convergence. Eur Econ Rev 40:1325–1352 Solow RM (1999) Neoclassical growth theory. 
In: Taylor JB, Woodford M (eds) Handbook of macroeconomics, vol 1. North Holland, Amsterdam, pp 637–667 Temple JRW (1998) Robustness tests of the augmented Solow model. J Appl Econom 13:361–375 Temple JRW (2000) Growth regressions and what the textbooks don’t tell you. Bull Econ Res 52:181–205 Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, Chichester Tsionas EG (2000) Regional growth and convergence: evidence from the United States. Reg Stud 34:231– 238 Turner RT (2000) Estimating the propagation rate of a viral infection of potato plants via mixtures of regressions. Appl Stat 49:371–384 Wedel M, Kamakura WA (1998) Market segmentation: conceptual and methodological foundations. Kluwer, Dordrecht

Spatial shift-share analysis versus spatial filtering: an application to Spanish employment data

Matías Mayor · Ana Jesús López

Abstract The aim of this work is to analyse the influence of spatial effects on the evolution of regional employment, thus improving the explanation of the existing differences. With this aim, two non-parametric techniques are proposed: spatial shift-share analysis and spatial filtering. Spatial shift-share models based on a previously defined spatial weights matrix allow the identification and estimation of the spatial effects. Furthermore, spatial filtering techniques can be used to remove the effects of spatial correlation, thus allowing the decomposition of the employment variation into two components, related respectively to the spatial and structural effects. The application of both techniques to the spatial analysis of regional employment in Spain leads to some interesting findings and shows the main advantages and limitations of each procedure, together with a quantification of their sensitivity to the choice of weights matrix.

Keywords Spatial autocorrelation · Spatial shift-share · Spatial filtering · Employment

1 Introduction

Shift-share analysis is a statistical tool that allows the study of regional development by means of the identification of two types of factors. The first group of factors

M. Mayor (B) · A. J. López Department of Applied Economics, University of Oviedo, Campus del Cristo s/n, Oviedo, Asturias 33006, Spain e-mail: [email protected] A. J. López e-mail: [email protected]


operates in a more or less uniform way throughout the territory under review, although the magnitude of their impact on the different regions varies with their productive structure. The second type of factors has a more specific character and operates at a regional level.

Although according to Dunn (1960) the main objective of the shift-share technique is the quantification of geographical changes, the existence of spatial dependence and/or heterogeneity has barely been considered. The classical shift-share approach analyses the evolution of an economic magnitude between two periods by identifying three components: a national effect, a sectoral effect and a competitive effect. However, this methodology focuses on the dependence of the considered regions on the national evolution and does not take into account the interrelation among geographical units.

The need to include spatial interaction was acknowledged by Hewings (1976) in his review of shift-share models. In the classical formulation, this spatial influence is captured to some extent, since the local predictions should converge on the national aggregate. Nevertheless, the estimation of the magnitude of sector i in region j is at the same time supposed to be independent of the growth of the same sector in another region k, an assumption which would only make sense in the case of a self-sufficient economy.

The increasing availability of data, together with the development of spatial econometric techniques, allows the incorporation of spatial effects into shift-share analysis. The aim is to obtain a competitive effect free of spatial influence, allowing the differentiation between a common pattern in the neighbouring regions and an individual pattern of the specific region under scrutiny. In order to achieve this objective, we analyse the suitability of two different procedures in this work: the definition of a spatial weights matrix to be included in a shift-share model, and the prior filtering of the considered variables. Both alternatives are applied to Spanish regional employment data from the Spanish Active Population Survey.

2 Shift-share analysis and spatial dependence

The introduction of spatial dependence in a shift-share model can be carried out by two alternative methods. The first one is based on the modification of the classical identities of deterministic shift-share analysis by adding some new extensions, while the second is based on a regression model (stochastic shift-share analysis) and the inclusion of spatial substantive and/or residual dependence.

According to Isard (1960), any spatial unit is affected by the positive and negative effects transmitted from its neighbouring regions. This idea is also expressed by Nazara and Hewings (2004), who assign great importance to spatial structure and its impact on growth. As a consequence, the effects identified in the shift-share analysis are not independent, since similarly structured regions can be considered in a sense to be “neighbouring regions” of a specified one, thus exercising some influence on the evolution of its economic magnitudes.


2.1 Classical shift-share analysis

If we denote by $X_{ij}$ the initial value of the considered economic magnitude corresponding to sector $i$ in spatial unit $j$, $X'_{ij}$ being the final value of the same magnitude, then the change undergone by this variable can be expressed as follows:

$$X'_{ij} - X_{ij} = \Delta X_{ij} = X_{ij}\, r + X_{ij}\,(r_i - r) + X_{ij}\,(r_{ij} - r_i) \tag{1}$$

$$r = \frac{\sum_{i=1}^{S}\sum_{j=1}^{R} (X'_{ij} - X_{ij})}{\sum_{i=1}^{S}\sum_{j=1}^{R} X_{ij}}, \qquad r_i = \frac{\sum_{j=1}^{R} (X'_{ij} - X_{ij})}{\sum_{j=1}^{R} X_{ij}}, \qquad r_{ij} = \frac{X'_{ij} - X_{ij}}{X_{ij}}$$

The three terms of this identity correspond to the shift-share effects:

National effect: $NE_{ij} = X_{ij}\, r$

Sectoral or structural effect: $SE_{ij} = X_{ij}\,(r_i - r)$

Regional or competitive effect: $CE_{ij} = X_{ij}\,(r_{ij} - r_i)$
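As an illustration of identity (1), the three effects can be computed directly from two sector-by-region tables. The following is only a minimal sketch in plain Python: the function name `shift_share` and the employment figures are hypothetical and are not taken from the paper.

```python
def shift_share(x0, x1):
    """Classical shift-share effects of identity (1).

    x0[i][j] and x1[i][j] hold the initial and final values of the
    magnitude for sector i (rows) and region j (columns).
    """
    S, R = len(x0), len(x0[0])
    total0 = sum(sum(row) for row in x0)
    total1 = sum(sum(row) for row in x1)
    r = (total1 - total0) / total0                    # national growth rate
    # Sectoral growth rates r_i (aggregated over regions)
    r_i = [(sum(x1[i]) - sum(x0[i])) / sum(x0[i]) for i in range(S)]
    ne = [[0.0] * R for _ in range(S)]
    se = [[0.0] * R for _ in range(S)]
    ce = [[0.0] * R for _ in range(S)]
    for i in range(S):
        for j in range(R):
            r_ij = (x1[i][j] - x0[i][j]) / x0[i][j]   # sector-region rate
            ne[i][j] = x0[i][j] * r                   # national effect
            se[i][j] = x0[i][j] * (r_i[i] - r)        # sectoral effect
            ce[i][j] = x0[i][j] * (r_ij - r_i[i])     # competitive effect
    return ne, se, ce


# Made-up employment figures: 2 sectors (rows) x 3 regions (columns).
x0 = [[100.0, 200.0, 50.0], [80.0, 40.0, 120.0]]
x1 = [[110.0, 190.0, 60.0], [90.0, 50.0, 115.0]]
ne, se, ce = shift_share(x0, x1)
```

For any input with strictly positive initial values, the three effects add up to the observed change in each cell, the competitive effects of each sector sum to zero over regions, and the sectoral effects sum to zero over all cells.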

As can be seen, besides the national growth we should consider the positive or negative contributions derived from each spatial environment, known as the net effect. Thus the sectoral effect captures the positive or negative influence on growth of the specialisation of productive activity in sectors with growth rates above or below the average, respectively. In turn, the competitive effect captures the special dynamism of a sector in a region in comparison to the dynamism of the same sector at national level. Once the regional and sectoral effects are calculated for each industry, their sum is zero, a property which Loveridge and Selting (1998) call “zero national deviation”.

In spite of its limitations,¹ the shift-share technique is widely used in the analysis of spatial dynamics. In order to solve one of the main drawbacks of this method, related to the fact that the sectoral and regional effects depend on the industrial structure, Esteban-Marquillas (1972) introduced the idea of “homothetic change”. This is defined as the value that the magnitude of sector i in region j would take if the sectoral structure of that region were assumed to coincide with the national one. In this way, the homothetic change of sector i in region j is given by the expression:

$$X^{*}_{ij} = \sum_{i=1}^{S} X_{ij}\,\frac{\sum_{j=1}^{R} X_{ij}}{\sum_{i=1}^{S}\sum_{j=1}^{R} X_{ij}} = \sum_{j=1}^{R} X_{ij}\,\frac{\sum_{i=1}^{S} X_{ij}}{\sum_{i=1}^{S}\sum_{j=1}^{R} X_{ij}} \tag{2}$$

1 Some limitations have been detected in shift-share analysis, derived, in the first place, from an arbitrary choice of the weights, which are not updated with the changes of the productive structure. Secondly, the obtained results are sensitive to the degree of sectoral aggregation and, furthermore, the growth attributable to secondary multipliers is assigned to the competitive effect when it should be collected by the sectoral effect, resulting in the interdependence of both components. Besides these problems, some authors such as Dinc et al. (1998) emphasise the complexity related to the increasing spatial dependence between sectors and regions, which should be reflected in the model by means of the incorporation of some term of spatial interaction.


leading to the following shift-share identity:

$$\Delta X_{ij} = X_{ij}\, r + X_{ij}\,(r_i - r) + X^{*}_{ij}\,(r_{ij} - r_i) + (X_{ij} - X^{*}_{ij})\,(r_{ij} - r_i) \tag{3}$$
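The homothetic values of (2) and the third and fourth terms of identity (3) can be sketched as follows. This is an illustrative reading of the formulas only; the function name `esteban_marquillas` and the data are assumptions made for the example.

```python
def esteban_marquillas(x0, x1):
    """Homothetic values (2) and the last two terms of identity (3).

    Returns (xh, nce, le): homothetic values, the term weighted by the
    homothetic value, and the term weighted by the deviation from it.
    """
    S, R = len(x0), len(x0[0])
    total = sum(sum(row) for row in x0)
    sector_tot = [sum(row) for row in x0]                        # national sector totals
    region_tot = [sum(x0[i][j] for i in range(S)) for j in range(R)]
    r_i = [(sum(x1[i]) - sum(x0[i])) / sum(x0[i]) for i in range(S)]
    xh = [[0.0] * R for _ in range(S)]
    nce = [[0.0] * R for _ in range(S)]
    le = [[0.0] * R for _ in range(S)]
    for i in range(S):
        for j in range(R):
            # Regional total distributed according to the national sector shares
            xh[i][j] = region_tot[j] * sector_tot[i] / total
            gap = (x1[i][j] - x0[i][j]) / x0[i][j] - r_i[i]      # r_ij - r_i
            nce[i][j] = xh[i][j] * gap
            le[i][j] = (x0[i][j] - xh[i][j]) * gap
    return xh, nce, le


# Made-up employment figures: 2 sectors (rows) x 3 regions (columns).
x0 = [[100.0, 200.0, 50.0], [80.0, 40.0, 120.0]]
x1 = [[110.0, 190.0, 60.0], [90.0, 50.0, 115.0]]
xh, nce, le = esteban_marquillas(x0, x1)
```

By construction the homothetic values preserve the regional and overall totals, and the two returned terms add up to the classical competitive effect $X_{ij}(r_{ij} - r_i)$.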

The third element on the right-hand side of the equation is known as the “net competitive effect”, which measures the advantage or disadvantage of each sector in the region with respect to the total. The part of growth not included in this effect when $X_{ij} \neq X^{*}_{ij}$ is called the “locational effect”; it corresponds to the last term of identity (3) and measures the degree of specialisation.

2.2 The structure of spatial dependence: spatial weights

Since each region should not be considered as an independent reality, it would be advisable to develop a more complete version of the shift-share identity, bearing in mind that the economic structure of each spatial unit will depend on others, which are considered “neighbouring regions” in some sense. A suitable approach is the definition of a spatial weights matrix, thus solving the problems of multi-directionality of spatial dependence.

The concept of spatial autocorrelation attributed to Cliff and Ord (1973) has been the object of different definitions and, in a general sense, it implies the absence of independence among the observations, showing the existence of a functional relation between what happens at a spatial point and in the population as a whole. The existence of spatial autocorrelation can be expressed as follows:

$$\operatorname{Cov}(X_j, X_k) = E(X_j X_k) - E(X_j)\,E(X_k) \neq 0 \tag{4}$$

$X_j$, $X_k$ being observations of the considered variables in units j and k, which could be measured in latitude and longitude, surface or any other spatial dimension. In the empirical application included in this paper, these spatial units are the European territorial units NUTS-III at the Spanish level. The spatial weights are collected in a square, non-stochastic matrix, whose elements $w_{jk}$ show the intensity of interdependence between the spatial units j and k.

$$W = \begin{bmatrix} 0 & w_{12} & \cdots & w_{1N} \\ w_{21} & 0 & \cdots & w_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ w_{N1} & w_{N2} & \cdots & 0 \end{bmatrix} \tag{5}$$

According to Anselin (1988), these effects should be finite and non-negative and they could be collected according to diverse options. A well-known alternative is the Boolean matrix, based on the criterion of physical contiguity and initially proposed by Moran (1948) and Geary (1954). These authors assume $w_{jk} = 1$ if j and k are neighbouring units and $w_{jk} = 0$ otherwise, the elements of the main diagonal of this matrix being null.


In order to allow an easy interpretation, the weights are standardised by rows, so that they satisfy $0 \le w_{jk} \le 1$ and $\sum_{k} w_{jk} = 1$ for each row j. As a result, the value of a spatial lag variable in a certain location is obtained as an average of the values in its neighbouring units.²

The consideration of different criteria for the development of the spatial weights matrix can deeply affect the empirical results. Thus, contiguity can be defined according to a specific distance: $w_{jk} = 1$ if $d_{jk} \le \delta$, $d_{jk}$ being the distance between two spatial units and $\delta$ the maximum distance allowed so that both can be considered neighbouring units. Similarly, the weights proposed by Cliff and Ord (1973, 1981) depend on the length of the common border adjusted by the inverse distance between both locations:

$$w_{jk} = \frac{b_{jk}^{\beta}}{d_{jk}^{\alpha}} \tag{6}$$

$b_{jk}$ being the proportion that the common border of j and k represents with respect to the total perimeter of j. From a more general perspective, weights should consider the potential interaction between units j and k and could be computed as $w_{jk} = d_{jk}^{-\alpha}$ and $w_{jk} = e^{-\beta d_{jk}}$. In some cases the definition of weights is carried out according to the concept of “economic distance” as defined by Case et al. (1993), with $w_{jk} = 1/|X_j - X_k|$, $X_j$ and $X_k$ being the per capita income or some related magnitude. Some other authors, such as López-Bazo et al. (1999), suggest the use of weights based on commercial relations.³ Some alternative definitions have been developed by Fingleton (2001), with $w_{ij} = GDP_{t=0}^{2}\, d_{ij}^{-2}$, and Boarnet (1998),⁴ whose weights increase with the similarity between the investigated regions:

$$w_{ij} = \frac{1/|X_i - X_j|}{\sum_{j} 1/|X_i - X_j|} \tag{7}$$
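Some of the weighting schemes above can be sketched in a few lines. The example below is only illustrative: the coordinates, the variable values and the function names are hypothetical, distances are Euclidean distances between assumed centroids, and units are assumed to have distinct locations.

```python
import math


def distance_band(coords, delta):
    """Binary weights: w_jk = 1 if j != k and d_jk <= delta, else 0."""
    n = len(coords)
    return [[1.0 if j != k and math.dist(coords[j], coords[k]) <= delta else 0.0
             for k in range(n)] for j in range(n)]


def inverse_distance(coords, alpha=1.0):
    """Distance-decay weights w_jk = d_jk ** (-alpha), with a zero diagonal."""
    n = len(coords)
    return [[math.dist(coords[j], coords[k]) ** (-alpha) if j != k else 0.0
             for k in range(n)] for j in range(n)]


def row_standardise(w):
    """Scale each row to sum to 1; rows without neighbours stay at zero."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out


def spatial_lag(w, x):
    """For each unit j, the W-weighted sum of x over the other units."""
    return [sum(w_jk * x_k for w_jk, x_k in zip(row, x)) for row in w]


# Hypothetical centroids of four spatial units and a made-up variable.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
x = [10.0, 20.0, 30.0, 40.0]
w_band = row_standardise(distance_band(coords, delta=3.2))
w_inv = row_standardise(inverse_distance(coords, alpha=2.0))
lag = spatial_lag(w_band, x)  # with row-standardised binary weights: neighbour average
```

With the row-standardised distance-band matrix, the lag of the first unit is the simple average of its two neighbours, which is the interpretation given in the text.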

The choice of the spatial weights matrix is a key step in spatial econometric modelling and nowadays there is no single method to select an appropriate specification of this matrix. In fact, this problem is suggested for future research by Anselin et al. (2004) and Paelinck et al. (2004), among others.

2 Together with the advantages of simplicity and easy use, the considered matrix shows some limitations, such as the non-inclusion of asymmetric relations, which is a requirement included in the five principles established by Paelinck and Klaassen (1979).

3 The consideration of a binary matrix with weights based only on distance measures guarantees exogeneity but it can also affect the empirical results as indicated by López-Bazo et al. (2004). In this sense, it would be interesting to compare these results with those related to some alternative weights defined as a function of the economic variables of interest.

4 Boarnet (1998) defines a spatial weights matrix based on population density, per-capita income and the sectoral structure of the employment in each region. The considered matrix is also standardised by rows, since its expression guarantees that the aggregation of the weights for each region leads to a unitary result.


3 Models of spatial dependence

The extension of the shift-share model proposed by Nazara and Hewings (2004) introduces the spatially modified growth rates according to the previously assigned spatial weights:

$$r_{ij} = r + (r^{v}_{ij} - r) + (r_{ij} - r^{v}_{ij}) \tag{8}$$

where $r^{v}_{ij}$ is the rate of growth of sector i in the neighbouring regions of a given spatial unit j, which can be obtained as follows:

$$r^{v}_{ij} = \frac{\sum_{k \in v} w_{jk} X^{t+1}_{ik} - \sum_{k \in v} w_{jk} X^{t}_{ik}}{\sum_{k \in v} w_{jk} X^{t}_{ik}} \tag{9}$$
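Expressions (8) and (9) can be illustrated with a small numerical sketch. The weights matrix, the data and the function names below are hypothetical; the matrix is a row-standardised contiguity matrix for three regions arranged in a line, so every region has at least one neighbour.

```python
def neighbour_growth(w, x0, x1):
    """r^v_ij of expression (9): growth of sector i in the W-weighted
    neighbourhood of region j (w is row-standardised, zero diagonal)."""
    S, R = len(x0), len(x0[0])
    rv = [[0.0] * R for _ in range(S)]
    for i in range(S):
        for j in range(R):
            lag0 = sum(w[j][k] * x0[i][k] for k in range(R))
            lag1 = sum(w[j][k] * x1[i][k] for k in range(R))
            rv[i][j] = (lag1 - lag0) / lag0
    return rv


def spatial_shift_share(w, x0, x1):
    """The three rate components of expression (8)."""
    S, R = len(x0), len(x0[0])
    total0 = sum(sum(row) for row in x0)
    total1 = sum(sum(row) for row in x1)
    r = (total1 - total0) / total0                     # national rate
    rv = neighbour_growth(w, x0, x1)
    mix = [[rv[i][j] - r for j in range(R)] for i in range(S)]
    comp = [[(x1[i][j] - x0[i][j]) / x0[i][j] - rv[i][j]
             for j in range(R)] for i in range(S)]
    return r, mix, comp


# Region 1 borders regions 0 and 2; regions 0 and 2 only border region 1.
w = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
x0 = [[100.0, 200.0, 50.0], [80.0, 40.0, 120.0]]
x1 = [[110.0, 190.0, 60.0], [90.0, 50.0, 115.0]]
r, mix, comp = spatial_shift_share(w, x0, x1)
```

Adding the national rate, the neighbourhood mix component and the neighbourhood competitive component recovers $r_{ij}$ for every sector-region pair, which is all that (8) states.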

It must be noted that the $w_{jk}$ elements correspond to the previously defined matrix of row-standardised weights. In any case, regional interactions are supposed to be constant between the considered periods of time, as is usually assumed in spatial econometrics.

Three components are considered in expression (8). The first one corresponds to the national effect, which is equivalent to the first effect of the classical (non-spatial) shift-share analysis. The second one, the sectoral effect or industry mix neighbouring regions-nation effect, shows a positive value when the evolution of the considered sector in the neighbouring regions of j is higher than the average. Finally, the third term is the competitive neighbouring regions effect, which compares the rate of growth of a given sector i in region j with the evolution of the spatially modified sector. Thus, a negative value of this effect shows a regional evolution that is worse than the one registered in the neighbouring regions, meaning that region j fails to take advantage of the positive influence of its neighbouring regions.

A weakness can be found in the previously defined model, since a single spatial weights matrix is considered for the computation of the different spatially modified rates of sectoral and global growth. This assumption would not be so problematic if we used the binary matrix instead of endogenous matrices, which would vary considerably depending on whether a sectoral or a global perspective is adopted. Furthermore, the use of the same structure of weights in the initial and final periods could be considered too simplistic, suggesting the need for developing a dynamic version.

Mayor and López (2005) develop an alternative approach in order to compute to what extent a spatial unit is being affected by the neighbouring territories. This procedure consists of introducing homothetic effects analogous to those defined by Esteban-Marquillas (1972) but referring to a regional environment. In this way, we would be able to define the value that the magnitude of sector i in region j would have taken if the sectoral structure of j were similar to that of its neighbouring regions. More specifically, the homothetic change with respect to the neighbouring regions would be given by the expression:

$$X^{v}_{ij} = \sum_{i=1}^{S} X_{ij}\,\frac{\sum_{k \in V} X_{ik}}{\sum_{i=1}^{S}\sum_{k \in V} X_{ik}} \tag{10}$$

A more complete option is based on the use of a spatial weights matrix. In this case the economic magnitude is defined as a function of the neighbouring values, and, therefore, the concept of homothetic employment would be substituted by spatially influenced employment, which would be computed according to a certain structure of spatial weights (W) and the effectively computed employment for each region-sector combination. The following identity would then hold:

$$\Delta X_{ij} = X_{ij}\, r + X_{ij}\,(r_i - r) + X^{v*}_{ij}\,(r_{ij} - r_i) + (X_{ij} - X^{v*}_{ij})\,(r_{ij} - r_i) \tag{11}$$

where the value of the magnitude is obtained from its neighbouring regions as:

$$X^{v*}_{ij} = \sum_{k \in V} w_{jk} X_{ik} \tag{12}$$

V being the set of neighbouring regions of j. One of the drawbacks of this spatially influenced employment is that, as a consequence of the considered expression, $\sum_{i,j} X^{v*}_{ij} \neq \sum_{i,j} X_{ij}$. This fact leads to two considerations with respect to the usefulness of the proposed definition: on the one hand, the extent of the effects for each sector-region is in some cases going to be considerably different from that obtained in the equivalent model of Esteban-Marquillas (1972), leading to a more difficult interpretation and comparison of the obtained results. On the other hand, as a result of the structure of the spatial weights, the expected level of employment would be different from the effective one. In order to solve both problems, an alternative concept is proposed using new spatially modified sectoral weights based on spatially influenced employment (12):

$$\frac{\sum_{j=1}^{R} X^{v*}_{ij}}{\sum_{i=1}^{S}\sum_{j=1}^{R} X^{v*}_{ij}} = \frac{X^{v*}_{i}}{X^{v*}}$$

leading to the so-called homothetic spatially influenced employment:

$$X^{v**}_{ij} = X_{j}\,\frac{X^{v*}_{i}}{X^{v*}} \tag{13}$$

It must be emphasised that this new concept satisfies the identity $\sum_{i,j} X^{v**}_{ij} = \sum_{i,j} X_{ij}$, although substantial differences are found in the distribution of the variable for each sector-spatial unit combination. The substitution of expression (13) into (11) leads to the identity:

$$\Delta X_{ij} = X_{ij}\, r + X_{ij}\,(r_i - r) + X^{v**}_{ij}\,(r_{ij} - r_i) + (X_{ij} - X^{v**}_{ij})\,(r_{ij} - r_i) \tag{14}$$

where the third expression is the spatial competitive net effect (SCNE**) and the fourth is the spatial locational effect (SLE**).
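The difference between spatially influenced employment (12) and its homothetic version (13) can be checked numerically. The sketch below uses a hypothetical three-region weights matrix and made-up data; the function names are illustrative assumptions.

```python
def spatially_influenced(w, x0):
    """X^{v*}_ij of expression (12): W-weighted sum of sector i over the neighbours of j."""
    S, R = len(x0), len(x0[0])
    return [[sum(w[j][k] * x0[i][k] for k in range(R)) for j in range(R)]
            for i in range(S)]


def homothetic_spatially_influenced(w, x0):
    """X^{v**}_ij of expression (13): regional total times the spatially
    modified sector share X^{v*}_i / X^{v*}."""
    S, R = len(x0), len(x0[0])
    xv = spatially_influenced(w, x0)
    xv_sector = [sum(row) for row in xv]
    xv_total = sum(xv_sector)
    region_tot = [sum(x0[i][j] for i in range(S)) for j in range(R)]
    return [[region_tot[j] * xv_sector[i] / xv_total for j in range(R)]
            for i in range(S)]


def spatial_effects(w, x0, x1):
    """Third and fourth terms of identity (14): SCNE** and SLE**."""
    S, R = len(x0), len(x0[0])
    xvv = homothetic_spatially_influenced(w, x0)
    r_i = [(sum(x1[i]) - sum(x0[i])) / sum(x0[i]) for i in range(S)]
    scne = [[0.0] * R for _ in range(S)]
    sle = [[0.0] * R for _ in range(S)]
    for i in range(S):
        for j in range(R):
            gap = (x1[i][j] - x0[i][j]) / x0[i][j] - r_i[i]   # r_ij - r_i
            scne[i][j] = xvv[i][j] * gap
            sle[i][j] = (x0[i][j] - xvv[i][j]) * gap
    return scne, sle


w = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
x0 = [[100.0, 200.0, 50.0], [80.0, 40.0, 120.0]]
x1 = [[110.0, 190.0, 60.0], [90.0, 50.0, 115.0]]
xv = spatially_influenced(w, x0)
xvv = homothetic_spatially_influenced(w, x0)
scne, sle = spatial_effects(w, x0, x1)
```

In this example the values from (12) add up to 655 although total employment is 590, whereas the values from (13) add up to 590, and the two spatial effects sum to the classical competitive effect in every cell.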


4 Spatial filtering

An alternative approach to deal with spatial autocorrelation in regression analysis involves the filtering of variables, allowing the elimination of the spatial effects. The best-known filtering procedures are those proposed by Getis (1990, 1995), based on the statistic of local association $G_i$ (Getis and Ord 1992), and Griffith's (1996, 2000) alternative procedure, based on the eigenfunction decomposition associated with the Moran statistic. According to Getis and Griffith (2002), both nonparametric spatial filtering approaches lead to similar solutions in the context of regression models, but each procedure must be applied in the appropriate context and its different origin must be considered.

Since one of the main problems in spatial regressions is related to the presence of stochastic regressors, leading to biased ordinary least squares (OLS) estimations, Getis (1990) develops a new procedure, based on the decomposition of a variable into two components (spatial and non-spatial) through the use of a filter or screen which removes the spatial component of each of the considered variables. In this paper we consider this screening procedure as a decomposition technique prior to further analyses. The spatial filtering developed by Getis (1990) is based on the consideration of a spatial vector S:

$$S \approx \rho W X \tag{15}$$

which takes the place of both the spatial weights matrix and the auto-regressive coefficient $\rho$. We must point out that the S vector must be designed to capture the spatial dependence in the considered data. Its construction is based on data points, but in dealing with surface partitions, points could be considered as the references of different spatial areas, and this vector allows the conversion of the dependent variable into its non-spatial equivalent:

$$X^{*} = X - S \tag{16}$$

Once the model includes all the non-spatial variables, it can be specified and estimated by the OLS method, leading to an unbiased estimation. In Getis (1990), S is found by means of the multistep second-order method developed by Ripley (1981). Haining (1982) asserts the importance of the second-order moment properties since one specific location on a map cannot be considered independent from other locations.⁵ Getis (1990) applied the local K-function $L_i(d) = A \sum_{j=1,\, j \neq i} k_{ij}(d) / \pi (n-1)$ as an association ratio and compares the expected and observed values for each individual observation, A being the regional size and $\sum_{j=1,\, j \neq i} k_{ij}(d)$ the aggregation over all points located within distance d of point i.⁶

5 This method makes use of the great deal of information provided by the set of distances between all pairs of points.

6 Getis (1990) proposes as choice criterion for the distance d the maximization of the expression $\sum_{i} \left[\hat{L}_i(d) - L_i(d)\right]^2$.


Although global tests such as Moran's I and Geary's c are generally used in a global context, a more detailed (local) detection of the spatial association is often required. Therefore, a modified version of the filtering procedure is developed by Getis (1995), based on the local statistic $G_i(d)$ by Getis and Ord (1992), which computes the degree of association due to the concentration of points within a distance d. Given a region divided into n subregions (which are considered as points with known values), $G_i(d)$ is the ratio between the sum of the $x_j$ values included within a distance d from point i and the sum of the values in all the regions excluding i:

$$G_i(d) = \frac{\sum_{j=1}^{n} w_{ij}(d)\, x_j}{\sum_{j=1}^{n} x_j}, \qquad j \neq i \tag{17}$$

The $G_i(d)$ statistic requires the variable X to have a natural origin and to take positive values. The matrix of spatial weights is binary, with $w_{ij}(d) = 1$ if $d_{ij} \le d$ and $w_{ij}(d) = 0$ if $d_{ij} > d$. Getis and Ord (1992) deduce the expressions of the expected value and the variance under the spatial independence hypothesis:

$$E(G_i) = \frac{\sum_{j} w_{ij}(d)}{n-1} = \frac{W_i}{n-1} \tag{18}$$

$$\operatorname{Var}(G_i) = \frac{W_i\,(n-1-W_i)}{(n-1)^2\,(n-2)}\,\frac{Y_{i2}}{Y_{i1}^{2}} \tag{19}$$

with⁷ $Y_{i1} = \frac{\sum_{j} x_j}{n-1}$ and $Y_{i2} = \frac{\sum_{j} x_j^{2}}{n-1} - Y_{i1}^{2}$.

The statistic $G_i(d)$ measures the concentration of the sum of values in the considered area, and its value increases when high values of X are found within a distance d of i. In general terms, the null hypothesis is that the values within a distance d of i are a random sample drawn without replacement from the set of all possible values. Then, assuming that the statistic is normally distributed, the existence of spatial dependence can be tested from the following expression:

$$Z_i = \frac{G_i(d) - E[G_i(d)]}{\sqrt{\operatorname{Var}(G_i(d))}} \tag{20}$$
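Expressions (17) to (20) can be put together in a short routine. The following is a minimal sketch under stated assumptions: locations are hypothetical point coordinates, the weights are the binary distance-band weights described in the text, and the function name `getis_ord_g` and the data are invented for the example. When the variance is zero, the standardised value is reported as NaN.

```python
import math


def getis_ord_g(x, coords, d):
    """Return a list of (G_i, E(G_i), Var(G_i), Z_i) following (17)-(20).

    x holds positive values; coords holds the matching point locations.
    """
    n = len(x)
    results = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Binary weights: units within distance d of i (i itself excluded)
        near = [j for j in others if math.dist(coords[i], coords[j]) <= d]
        w_i = len(near)
        sum_x = sum(x[j] for j in others)
        g = sum(x[j] for j in near) / sum_x
        y1 = sum_x / (n - 1)
        y2 = sum(x[j] ** 2 for j in others) / (n - 1) - y1 ** 2
        e = w_i / (n - 1)
        var = w_i * (n - 1 - w_i) * y2 / ((n - 1) ** 2 * (n - 2) * y1 ** 2)
        z = (g - e) / math.sqrt(var) if var > 0 else float("nan")
        results.append((g, e, var, z))
    return results


# Hypothetical data: four nearby points with high values and one remote point.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
x = [10.0, 12.0, 11.0, 9.0, 1.0]
stats = getis_ord_g(x, coords, d=1.5)
```

In this example the remote point has no neighbours within the chosen distance, so its statistic, expectation and variance are all zero, which is one of the degenerate cases for the variance mentioned in the text.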

7 As expected, the variance of this statistic would be null when no neighbouring regions exist ($W_i = 0$), when all the n − 1 regions are neighbouring regions of i ($W_i = n - 1$) and also when the values assigned to the n − 1 observations are coincident ($Y_{i2} = 0$).

Getis (1995) proposes the computation of the filtering vector from the values of the $G_i(d)$ statistic. Since the expected value of the Getis statistic, $E(G_i(d))$, represents the value in location i when spatial autocorrelation is absent, the ratio $G_i(d)/E(G_i(d))$ is used in order to remove the spatial dependence included in the variable. If the considered statistic is higher than its expected value, then the spatial dependence is positive. In order to remove this spatial dependence from the considered variable we obtain the filtered series:

$$\tilde{x}_i = \frac{x_i \left(\dfrac{W_i}{n-1}\right)}{G_i(d)} \tag{21}$$

The difference between the original and the filtered series yields a new variable that captures the spatial dependence, $L = X - \tilde{X}$. According to Getis and Griffith (2002), two main ideas can be identified in the filtering procedure: firstly, it is necessary to identify a correct distance d to include the spatial dependence among the regions and, secondly, the contribution of each individual observation to the spatial dependence should be computed. The main aim is to find an optimal value d which maximizes the existing spatial dependence. To accomplish this, Getis (1995) proposes maximizing the sum of the absolute standardised values of the statistic $G_i(d)$ over all the observations of X:

$$\max_{d} \sum_{k=1}^{R} |Z_k| = \max_{d} \sum_{k=1}^{R} \frac{|G_k(d) - E(G_k(d))|}{\sqrt{\operatorname{Var}(G_k(d))}} \tag{22}$$
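The filter (21) and the distance criterion (22) can be sketched together. The example below is illustrative only: the search is restricted to a hypothetical list of candidate distances, units whose statistic is zero (no neighbours within d) are left unchanged, and observations with zero variance are skipped in the criterion. These conventions, like the function names and the data, are assumptions of the sketch rather than part of the procedure described in the text.

```python
import math


def g_stats(x, coords, i, d):
    """G_i(d), its expectation and its variance under spatial independence."""
    n = len(x)
    others = [j for j in range(n) if j != i]
    near = [j for j in others if math.dist(coords[i], coords[j]) <= d]
    w_i = len(near)
    sum_x = sum(x[j] for j in others)
    g = sum(x[j] for j in near) / sum_x
    y1 = sum_x / (n - 1)
    y2 = sum(x[j] ** 2 for j in others) / (n - 1) - y1 ** 2
    e = w_i / (n - 1)
    var = w_i * (n - 1 - w_i) * y2 / ((n - 1) ** 2 * (n - 2) * y1 ** 2)
    return g, e, var


def getis_filter(x, coords, d):
    """Filtered series of (21): x_i * E(G_i) / G_i(d).
    Assumption: units with G_i(d) = 0 are returned unchanged."""
    filtered = []
    for i in range(len(x)):
        g, e, _ = g_stats(x, coords, i, d)
        filtered.append(x[i] * e / g if g > 0 else x[i])
    return filtered


def choose_distance(x, coords, candidates):
    """Candidate distance with the largest sum of |Z_k|, as in (22)."""
    def score(d):
        total = 0.0
        for i in range(len(x)):
            g, e, var = g_stats(x, coords, i, d)
            if var > 0:                      # skip degenerate observations
                total += abs(g - e) / math.sqrt(var)
        return total
    return max(candidates, key=score)


coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
x = [10.0, 12.0, 11.0, 9.0, 1.0]
d_best = choose_distance(x, coords, [1.5, 2.5, 3.5])
x_filtered = getis_filter(x, coords, d_best)
spatial_part = [xi - fi for xi, fi in zip(x, x_filtered)]   # L = X - X~
```

A useful sanity check is that a series with identical values is returned unchanged by the filter, since in that case each statistic equals its expected value.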

4.1 Spatial filtering models

As previously mentioned, the filtering procedures can be considered a suitable tool when analysing spatial autocorrelation and trying to identify spatial effects. In this section we propose some useful models, analysing their main characteristics, advantages and limitations.

Model 1: Once the filtering process is finished, a traditional shift-share analysis can be carried out considering both the spatial and non-spatial (filtered) components of the variables. The obtained results are not strictly comparable to those related to the original data for two different reasons: first, we must take into account that different filters are applied to the initial and final periods, and second, the considered rates of growth are different in each case. Thus, the rates of growth for the filtered variable ($\tilde{X}$) are:

$$\tilde{r}_{ij} = \frac{\tilde{X}^{t}_{ij} - \tilde{X}^{t-k}_{ij}}{\tilde{X}^{t-k}_{ij}}, \qquad \tilde{r}_{i} = \frac{\tilde{X}^{t}_{i} - \tilde{X}^{t-k}_{i}}{\tilde{X}^{t-k}_{i}}, \qquad \tilde{r} = \frac{\tilde{X}^{t} - \tilde{X}^{t-k}}{\tilde{X}^{t-k}} \tag{23}$$

leading to the following shift-share decomposition:

$$\Delta \tilde{X}_{ij} = \tilde{X}_{ij}\,\tilde{r} + \tilde{X}_{ij}\,(\tilde{r}_i - \tilde{r}) + \tilde{X}_{ij}\,(\tilde{r}_{ij} - \tilde{r}_i) \tag{24}$$


Similarly, we can define the rates of growth for the spatial component $X - \tilde{X} = L$, leading to the following decomposition:

$$\Delta L_{ij} = L_{ij}\, r^{L} + L_{ij}\,(r^{L}_{i} - r^{L}) + L_{ij}\,(r^{L}_{ij} - r^{L}_{i}) \tag{25}$$

where $r^{L}$, $r^{L}_{i}$, $r^{L}_{ij}$ are respectively the global, sectoral and regional-sectoral rates of growth. The described model leads to some interesting results although, as explained above, it is not comparable to the traditional shift-share and, therefore, the sum of spatial and non-spatial effects is not expected to coincide with that obtained in the classical identity applied to the original data. In fact, the coincidence is only verified by the national effect.

Model 2: In this option, two new effects can be defined: the spatial competitive effect (SCE) and the non-spatial or filtered competitive effect (FCE). The proposal is similar, in some sense, to the Esteban-Marquillas decomposition. In this case, the homothetic employment is substituted by the expected level of the variable without spatial influences, and the deviation between the expected and real values is due to the spatial spillover effects. This deviation, that is, the spillover effect, is measured in terms of the variable itself.⁸ Thus, the spatial competitive effect and the filtered competitive effect are given by the following expressions:

$$SCE_{ij} = L_{ij}\,(r_{ij} - r_i) = (X_{ij} - \tilde{X}_{ij})\,(r_{ij} - r_i) \tag{26}$$

$$FCE_{ij} = \tilde{X}_{ij}\,(r_{ij} - r_i) \tag{27}$$

It can be proved that this decomposition satisfies the additivity property, in a similar way to the original Esteban-Marquillas model. Thus, the filtered competitive effect is strictly comparable to that of the traditional shift-share, since the identity CE = FCE + SCE holds.
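The additivity of Model 2 follows directly from (26) and (27), as the short sketch below illustrates. The filtered initial values are made-up numbers standing in for the output of a spatial filter, and the function name is hypothetical.

```python
def filtered_competitive_effects(x0, x1, x0_filtered):
    """SCE (26) and FCE (27), using the growth rates of the original data."""
    S, R = len(x0), len(x0[0])
    r_i = [(sum(x1[i]) - sum(x0[i])) / sum(x0[i]) for i in range(S)]
    sce = [[0.0] * R for _ in range(S)]
    fce = [[0.0] * R for _ in range(S)]
    for i in range(S):
        for j in range(R):
            gap = (x1[i][j] - x0[i][j]) / x0[i][j] - r_i[i]   # r_ij - r_i
            fce[i][j] = x0_filtered[i][j] * gap               # non-spatial part
            sce[i][j] = (x0[i][j] - x0_filtered[i][j]) * gap  # spatial part L_ij
    return sce, fce


x0 = [[100.0, 200.0, 50.0], [80.0, 40.0, 120.0]]
x1 = [[110.0, 190.0, 60.0], [90.0, 50.0, 115.0]]
# Made-up filtered values, standing in for the result of a spatial filter.
x0_filtered = [[95.0, 210.0, 48.0], [82.0, 38.0, 118.0]]
sce, fce = filtered_competitive_effects(x0, x1, x0_filtered)
```

Whatever filtered values are supplied, the two effects add up to the classical competitive effect $X_{ij}(r_{ij} - r_i)$ cell by cell, because the same rate differential multiplies both parts of $X_{ij}$.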

Model 3: A new option could be the comparison between the results obtained with filtered values and those obtained with the spatial shift-share developed by Mayor and López (2005). We are trying to define an alternative concept to the homothetic change by Esteban-Marquillas (10) and the homothetic spatially influenced variable by Mayor and López (2005) (13) by using a modified sectoral weight (without spatial spillovers) based on the values of the filtered variable. The spatially influenced variable (12) is substituted by the filtered value ($\tilde{X}$):

$$\frac{\sum_{j=1}^{R} \tilde{X}_{ij}}{\sum_{i=1}^{S}\sum_{j=1}^{R} \tilde{X}_{ij}} = \frac{\tilde{X}_{i}}{\tilde{X}} \tag{28}$$

8 Although this new decomposition initially refers to the competitive effect it could also be extended to other components (Arcelus 1984).


Thus, the filtered homothetic employment based on the non-spatial component of the variable would be obtained as follows:

$$\tilde{X}^{**}_{ij} = X_{j}\,\frac{\tilde{X}_{i}}{\tilde{X}} \tag{29}$$

leading to the following decomposition:

$$\Delta X_{ij} = X_{ij}\, r + X_{ij}\,(r_i - r) + \tilde{X}^{**}_{ij}\,(r_{ij} - r_i) + (X_{ij} - \tilde{X}^{**}_{ij})\,(r_{ij} - r_i) \tag{30}$$

from which two different effects can be identified: first, the filtered net competitive effect (FNCE) which describes the expected change in the variable assuming the national sectorial structure without spatial spillovers, and, second, the non-filtered locational effect (SLE) computing the difference between expected and real change of the variable due to the sectoral specialization of the region together with the spillover effects. In this case, it is verified that the sum of both effects leads to the same result as in the traditional shift-share. The computation of dynamic effects would be very interesting in order to obtain large series of spatial and non-spatial competitive effects, thus allowing their modelling and forecasting.

5 Some findings for the Spanish case

Since the role of geographical space is essential in understanding the existing regional differences, several empirical studies have analysed the disparities between Spanish provinces, focusing on different economic variables and using a wide variety of procedures. Some of these studies deal with spatial convergence, as in Dall’erba (2005), while others study the spatial distribution of unemployment (as in López-Bazo et al. 2002). An alternative approach is proposed by Márquez and Hewings (2003), whose work analyses regional competition between Spanish regions (NUTS II), which can be considered as regional economies nested within a national system. The main property of this system is the interdependence among the Spanish provinces, since the evolution of each region depends on the behaviour of the neighbouring regions. Adopting a dynamic point of view, Márquez et al. (2006) identify two macroeconomic effects in the Spanish regions: economy-wide effects (aspatial) and neighbourhood effects.

The spatial filtering shift-share models proposed in Sect. 4 allow a similar treatment, since the role of space is introduced by decomposing the competitive effect into two new effects: spatial and non-spatial. The main attraction of this method is its capacity to measure the contribution of space (spatial spillovers) to the evolution of the regional economy. The spatial filtering shift-share models allow these spatial effects to be quantified in terms of the economic variable.

Spatial shift-share analysis versus spatial filtering

[Figure 1: chart of the competitive effect (axis label "Competitive Ef.", ticks from -125.0 to 125.0) for each Spanish NUTS III province]

Fig. 1 Competitive effects of the evolution in employment in Spanish NUTS III

The existence of spatial shift-share models and the new filtering models proposed in this paper reflects the increasing importance of spatial externalities in the new economic geography and the endogenous growth theories. The previously described models can be applied to the Spanish case, analysing the sectoral evolution of regional employment. More specifically, in this section we focus on the four main economic activities (agriculture, industry, construction and services), considering the Spanish NUTS-III European territorial units, which leads to a total of 47 provinces.9 The information has been provided by the Spanish Economically Active Population Survey (EPA), whose methodology was modified in 2005 for three reasons: the need to adapt to the new demographic and labour reality of Spain (due mainly to the increase in the number of foreign residents), the incorporation of new European regulations in accordance with the norms of the European Union Statistical Office (EUROSTAT) and the introduction of improvements in the information gathering method (changes in questionnaires and interviews carried out by the Computer Assisted Telephone Interviewing, CATI, method).

The shift-share analysis was carried out for the period 1999–2004, leading to some interesting findings related to sectoral and spatial patterns. The classic shift-share analysis shows positive sectoral effects in Construction and Services and negative effects in Agriculture and Industry. The competitive effect is aggregated by provinces in Fig. 1. One way of measuring the weights of the sectoral and competitive effects in explaining the differences in regional evolution (the net change, $\Delta X_{ij} - X_{ij} r$) is to compute the weight of the variance of each component in the overall observed variance together

9 In order to compare the results of the spatial filtering shift-share with the spatial shift-share based on a contiguity matrix, Ceuta and Melilla and the Balearic and Canary Islands are excluded.


Table 1 Results of Moran autocorrelation test

                   Neighbour matrix         Distance
                   z value    p value       z value    p value
Growth rates of gross employment
Agriculture         3.757     0.000          4.884     1.04E-06
Industry           −0.534     0.593         −0.201     0.841
Construction        4.031     5.56E-05       5.501     3.77E-08
Services            0.726     0.468          1.714     0.08654
Total               4.151     3.32E-05       5.409     6.33E-08

Growth rates of filtered employment
                   Neighbour matrix         Distance
                   z value    p value       z value    p value
Agriculture        −0.137     0.891         −0.613     0.539
Industry            1.893     0.058         −0.445     0.655
Construction        1.192     0.233         −0.159     0.873
Services            0.504     0.614         −0.301     0.764
Total              −0.027     0.978          1.311     0.190

with a term collecting the covariance between sectoral and competitive effects:

\[ \frac{\mathrm{Var}(SE)}{\mathrm{Var}(CN)} = 0.05497, \qquad \frac{\mathrm{Var}(CE)}{\mathrm{Var}(CN)} = 0.71417, \qquad \frac{2\,\mathrm{Cov}(SE, CE)}{\mathrm{Var}(CN)} = 0.23086 \]

These results show that the competitive effect and the covariance are much more important than the sectoral effect. As a consequence, further analysis of the competitive effect and the use of homothetic models would be necessary in order to avoid the interdependence between sectoral and competitive effects.

The Moran test was carried out in order to detect spatial autocorrelation, leading to the conclusion that a slightly positive spatial autocorrelation exists among the Spanish provinces. More specifically, two different specifications of the spatial weights matrix have been considered in these tests: a binary exogenous matrix and a distance (km) based matrix. In the first case, the weights are 1 for neighbouring provinces and 0 otherwise. In the second case, the weights are obtained from the expression $w_{ij} = d_{ij}^{-1}$. The results of the Moran autocorrelation test are summarised in Table 1 and refer to the rate of growth in the considered period.10

As a first step, the filtering process has been carried out on the variables (levels of sectoral employment in agriculture, industry, construction and services in the 47 Spanish NUTS III). In each spatial unit, the local spatial autocorrelation statistic $G_i(d)$ (17) is evaluated at a series of increasing distances (10 km) together with its moments $E(G_i(d))$ and $\mathrm{Var}(G_i(d))$, according to the previously considered expressions. The

10 Longhi and Nijkamp (2005) use the Moran test to detect autocorrelation in employment levels and also in the absolute and relative changes of employment.
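The two weight specifications used in the Moran test can be illustrated with a small sketch. The Python code below computes the global Moran statistic under a binary neighbour matrix and under inverse-distance weights w_ij = 1/d_ij. The four locations, their coordinates and the growth values are hypothetical, and the function names are illustrative. The helper `two_sided_p` assumes that the reported p values are two-sided normal p values, which is an assumption, although it agrees with pairs such as z = 4.031 and p = 5.56E-05 in Table 1.

```python
import math


def morans_i(x, w):
    """Global Moran statistic for values x and a square weight matrix w."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)                      # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def binary_weights(n, neighbour_pairs):
    """Weights equal to 1 for neighbouring units and 0 otherwise."""
    w = [[0.0] * n for _ in range(n)]
    for i, j in neighbour_pairs:
        w[i][j] = w[j][i] = 1.0
    return w


def inverse_distance_weights(coords):
    """Weights w_ij = 1 / d_ij, with zeros on the diagonal."""
    n = len(coords)
    return [[0.0 if i == j else 1.0 / math.dist(coords[i], coords[j])
             for j in range(n)] for i in range(n)]


def two_sided_p(z):
    """Two-sided p value of a z statistic under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical example: four units on a line, one distance unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
growth = [1.0, 2.0, 3.0, 4.0]
i_binary = morans_i(growth, binary_weights(4, [(0, 1), (1, 2), (2, 3)]))
i_distance = morans_i(growth, inverse_distance_weights(coords))
```

With the same data the two matrices can give different values of the statistic, and even different signs, which is why both specifications are reported side by side.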


Table 2 National, sectoral and competitive effects with non-filtered and filtered employment values NUTS

National effect Total

Filtered

Sectoral effect Ratio

Total

Filtered

−1.3

−2.0

−2.3

0.7

1.

Albacete

28.3

26.4

1.07

2.

Alicante

117.2

90.1

1.30

3.

Almería

40.6

37.7

1.08

4.

Ávila

11.9

15.4

0.78

5.

Badajoz

42.8

46.9

0.91

6.

Barcelona

433.0

480.5

0.90

7.

Bilbao

90.9

119.8

0.76

8.

Burgos

28.4

38.7

0.73

9.

Cáceres

27.2

32.1

0.85

Cádiz

67.5

86.6

10.

1.5

Competitive effect Ratio

Total

Filtered

0.66

−3.7

−7.0

0.53

39.3

1.21

38.3

−7.5

−5.12

−6.9

−38.6

0.18

1.61 −42.4

−96.7

0.44

−1.1

−1.33

−0.2

−1.1

0.17

−2.2

10.5

−1.2

−3.4

−3.40 0.36

47.5

−4.4

−0.9

−0.21 −112.7 −386.9

5.0

3.1

−1.6

−3.0

0.54

0.78

3.3

2.3

1.45

−3.4

−4.2

0.82

1.1

0.3

2.73

−3.6

−4.9

0.74

−5.4

−17.6

−2.1

−2.9

−0.37 −23.8

−3.5

0.4

−0.5

11.

Castellón

41.7

36.0

1.16

12.

Ciudad Real

32.9

37.1

0.89

13.

Córdoba

47.0

42.8

1.10

14.

Coruña (A)

85.9

151.1

0.57

15.

Cuenca

13.6

14.8

0.92

2.0

0.9

−5.7

0.8

0.9

15.3

16.

Girona

54.9

38.7

1.42

17.

Granada

50.1

42.3

1.19

18.

Guadalajara

12.9

15.3

0.84

0.0

19.

Huelva

29.5

37.8

0.78

20.

Huesca

17.0

16.9

1.00

−3.2

21.

Jaén

42.6

41.6

1.02

22.

León

35.8

46.0

0.78

23.

Lleida

32.4

31.1

1.04

24.

Logroño

22.6

24.4

0.93

25.

Lugo

29.9

33.6

0.89 −11.1 −10.1

−1.3

−6.6

−0.7

−1.0

−3.2

−0.7

−1.07

−0.88

4.77

0.29

−6.5

−12.2

0.53

3.0

−28.9

−0.10

2.16 −11.8

0.73

Ratio

13.7 2.4

18.8 19.6

−26.0

−19.4

0.46 −0.06

0.31

−11.4

−1.19

4.5

0.53

−16.7

6.83

−1.12

−26.9

−0.73

−0.9

−22.5

0.04

−7.0

0.94 −15.0

−27.5

0.55

−4.1

0.24

13.9

−0.43

10.7

−0.1

−0.34

−3.1

0.42

−1.5

0.46 −31.1

−4.9

0.65

−4.8

0.66

−2.7

−6.0

9.8

−9.3

−43.5

6.0

24.8

1.10 −10.9

70.7

1.09

0.29 0.71 0.24

−0.15

26.

Madrid

455.9

577.5

0.79

52.3

64.5

0.81

118.1

297.7

27.

Málaga

86.0

70.7

1.22

13.2

11.2

1.18

18.9

145.0

0.13

28.

Murcia

90.3

76.0

1.19

−6.0

−6.0

1.00

46.5

−5.6

−8.28

−2.5

−3.8

−4.1

−6.8

29.

Orense

25.4

28.8

0.88

30.

Oviedo

74.6

90.1

0.83

31.

Palencia

12.6

17.3

0.73

32.

Pamplona

49.6

51.3

0.97

33.

Pontevedra

71.8

71.1

1.01

34.

Salamanca

24.7

31.7

0.78

35.

San Sebastian

60.5

73.2

0.83

36.

Santander

39.1

48.4

0.81

37.

Segovia

13.0

16.4

0.80

38.

Sevilla

111.4

111.2

1.00

39.

Soria

8.2

9.3

0.88

−1.1

−1.0

−8.3

1.3

−2.6

0.3

−1.9

−9.5

1.9 3.2

−4.51 −22.4

50.5

0.65 −21.9

−38.6

0.60 −10.7

25.9

0.51

−3.3

−7.4

0.87 −16.9

−44.8

−0.81 −21.4

142.0

0.69

−5.3

−4.6

0.07

7.5

3.4

−0.4

−0.5

0.66

0.6

0.4

7.80

−4.0

−1.6

−2.6

0.61

−0.2

3.2

−3.1

40.7

228.7

−5.0

−3.2

0.40

−0.44

0.57 0.44

−0.41

0.38 1.15

−0.15

2.20

−6.34

0.18 1.57


Table 2 continued NUTS

National effect Total

Filtered

Sectoral effect Ratio

Total

Competitive Effect

Filtered

Ratio

−1.5

−0.96

−4.1

0.40

40.

Tarragona

56.2

45.1

1.24

1.4

41.

Teruel

10.1

10.0

1.00

42.

Toledo

40.8

49.1

0.83

−1.3

−2.1

43.

Valencia

175.4

154.1

1.14

2.9

44.

Valladolid

42.2

56.8

0.74

0.0

−3.6

−0.80

45.

Vitoria

27.1

33.2

0.82

Zamora

11.9

14.3

0.83

−1.3

1.41

46.

−1.9

47.

Zaragoza

−3.6

−5.9

Total

74.2

72.4

1.02

2997.6

3291.9

0.91

−1.6

−1.3

−1.0

−1.5

0.61

0.03

0.88 0.62

Total

Filtered

Ratio

1.8

14.0

0.13

−2.9

−4.3

0.67

1.5

2.00

42.9

−84.8

−0.51

−21.3

0.34

3.0

−22.9 −7.3

−2.0

−10.5

−34.9 2.1

−22.0

0.66

−0.93

0.48

“optimal” distance11 was computed in order to obtain the filtered variable according to expression (21) for each sector in the initial and final periods. Once the filtered variables have been computed, the previously explained models are applied to ascertain the contribution of spatial effects to employment change; that is, the objective is to quantify the contribution of spatial relationships to the provincial evolution of employment, focusing on the competitive effect, since this is the shift-share component that captures the specific dynamism of each region.

With respect to the first model, the effects obtained according to (24) (the filtered national, sectoral and competitive effects) are compared to those related to the original (non-filtered) variable in Table 2. These results indicate that the final effects of the spatial dependence are slightly negative: removing the spatial effect would lead to an employment level of 14451.86 in 1999, versus 13671.75 with the spatial effect. If filtered variables are considered, the change in the employment level in the period 1999–2004 would be 3291.95 instead of 2997.60, meaning that the aggregate spatial influence is negative in terms of employment. As a consequence, we can conclude that the aggregate national effect computed with filtered values would increase by 9%, while the computation by provinces reflects the different spatial schemes. For example, Ávila, Barcelona, Bilbao, Burgos, Coruña and Madrid are some of the provinces with a filtered employment level higher than the original value; that is, if spatial dependence is removed (filtering process), the employment level increases. On the other hand, Alicante, Girona, Tarragona and Valencia, for example, show a positive aggregate effect of spatial dependence.

11 In this case we have considered the “optimal” distance, selecting the distance by maximizing the spatial dependence according to expression (22). More specifically, the selected distances for the employment levels in 1999 are 425 km in Agriculture, 550 km in Industry, 450 km in Construction and 550 km in Services, while the distances in 2004 are 425 km in Agriculture, 550 km in Industry, 450 km in Construction and 450 km in Services. With this approach, as distance increases from one point, the local statistics also increase if spatial autocorrelation is detected.
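The filtering step can be sketched as follows. Expression (21) is not reproduced in this passage, so the Python code below assumes the form proposed by Getis (1995), in which the filtered value is x_i multiplied by the ratio of the expected to the observed local statistic, E(G_i(d)) / G_i(d), with E(G_i(d)) = W_i / (n − 1) and binary weights within distance d. The coordinates, the values and the function name `getis_filter` are hypothetical, and leaving a unit with no neighbours within d unfiltered is a choice made for this sketch only.

```python
import math


def getis_filter(x, coords, d):
    """Filtered values x_i * E[G_i(d)] / G_i(d), assuming positive x (Getis 1995 form)."""
    n = len(x)
    filtered = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        near = [j for j in others if math.dist(coords[i], coords[j]) <= d]
        # Observed local statistic: share of the total (excluding i) found within d.
        g_i = sum(x[j] for j in near) / sum(x[j] for j in others)
        # Expected value under no spatial association: W_i / (n - 1).
        expected = len(near) / (n - 1)
        # Sketch-only choice: a unit with no neighbours within d is left unfiltered.
        filtered.append(x[i] * expected / g_i if g_i > 0 else x[i])
    return filtered


# Hypothetical example: three clustered units and one remote unit.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
employment = [10.0, 12.0, 11.0, 1.0]
filtered_employment = getis_filter(employment, coords, d=1.5)
spatial_component = [x - f for x, f in zip(employment, filtered_employment)]
```

In this toy example the first unit is surrounded by above-average values, so its filtered value (20/3) is lower than the observed one and the difference is attributed to the spatial component. When all units share the same value the filter leaves the variable unchanged.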


Table 3 Sectoral effect with filtered and non-filtered employment values

             Agriculture    Industry    Construction    Services
Total          −274.4        −370.6         295.1         349.8
Filtered       −284.3        −468.1         281.8         470.6
Ratio            0.96          0.79         1.047          0.74

[Figure 2: chart with two series, "Filtered Competitive Ef." and "Non-Filtered Comp.Ef" (vertical axis "Competitive Ef.", ticks from -125 to 125), by NUTS III province]

Fig. 2 Decomposition of the competitive effect into filtered competitive effect (FCE) and spatial competitive effect (SCE)

With regard to the sectoral and competitive effects, the analysis is more complex, since changes can be found both in the amount of the effects and in their signs. In fact, the observed variations are caused by two different factors: the considered variable (original or filtered employment) and the new filtered rates of growth. Sectoral effects are compared in Table 3. These results show positive interactions in most of the sectors (agriculture, industry and construction), services being the only activity with no significant positive spatial contribution and thus leading to a reduction of 26% in sectoral employment. As we have previously explained, the second model, with the consideration of a spatial competitive effect separated from the filtered competitive effect, has the advantage of being strictly comparable to the traditional competitive effect. A graphical representation is shown in Fig. 2, where it is observed that the filtered competitive effect (FCE) is not as important as the non-filtered competitive effect. Regarding the third model we must point out that the filtered net competitive effect (FNCE) reflects the variation in employment due to the advantages (disadvantages) of each sector in each different region when assuming a sectoral structure similar to the national one (homothetic), once the spillover effects have been discounted. On the other hand, spatial locational effect (SLE) measures the deviation with respect to


Table 4 Comparison of the net competitive effects

                                                      Agriculture   Industry   Construction   Services
(1) Net competitive E. (E-M, 1972)                       15.283      48.699       4.468       −24.259
(2) Spatial net competitive E._Binary (M&L, 2005)        16.162      48.590       4.487       −24.094
(2)/(1)                                                   1.057       0.998       1.004         0.993
(3) Spatial net competitive E._Boarnet                   15.721      45.694       4.475       −24.683
(3)/(1)                                                   1.029       0.938       1.002         1.017
(4) Filter net competitive E.                            14.931      48.056       4.459       −24.444
(4)/(1)                                                   0.977       0.987       0.998         1.008

E-M Esteban-Marquillas (1972), M&L Mayor and López (2005)

the previous hypothesis due to spatial effects and the mobility of the labour market in response to comparative advantages. We compare the FNCE and the spatial net competitive effect SNCE** based on the homothetic spatial employment with the net competitive effect (NCE, Esteban-Marquillas 1972), where the spatial effects are not considered. Table 4 summarizes the results of these comparisons by sector. First, the FNCE shows a positive spatial influence in sectoral employment and, second, the SNCE** shows a positive spatial influence in agriculture, construction and services with a binary spatial weight matrix. The results are different if an endogenous matrix is considered (Boarnet matrix); in this case, the services sector suffers a negative spatial influence in terms of employment. The results of the FNCE coincide with those of the SNCE** with a binary specification, with the exception of industrial employment. It must be pointed out that the new effects related to model 3 and those associated with the approach by Mayor and López (2005) have different interpretations, since the first refers to a local spatial dependence while the second responds to a more general perspective. The spatial locational effect (SLE) changes in value for each sector-region combination, but with a certain stability, as can be observed in Fig. 3. The SLE is lower than the FNCE with the exception of Castellón, Cuenca, Lleida, Lugo and Zamora.

6 Concluding remarks

In this paper we have analysed the influence of spatial effects in the evolution of regional employment, in order to improve the explanation of the existing differences. The proposed method considers each sector separately, thus allowing changes in the sectoral structure and also between the initial and final periods. From the conceptual point of view, this approach assumes that the considered value is the result of spatial and non-spatial relations. The advantage of the proposed models is the possibility of measuring the spatial spillovers for each region in terms of employment. Time series of these new effects could be obtained and modelled by means of dynamic shift-share analysis in order to obtain the corresponding future values.


[Figure 3: chart with two series, "Spatial Locational Ef." and "Filtered Net Comp. Ef." (vertical axis "Competitive ef.", ticks from -130 to 120), by Spanish NUTS III province]

Fig. 3 Decomposition of the competitive effect between filtered net competitive effect (FNCE) and the spatial locational effect (SLE)

One of the main problems of the Getis spatial filtering process is the choice of the optimal distance used to obtain the filter, since the results are sensitive to the different screens considered. The underlying idea that the intensity of the relation decreases with distance is not always true and, therefore, the definition of other spatial weights based not only on distance could be a suitable solution. In this sense, Patuelli et al. (2006) and Tiefelsdorf and Griffith (2007) consider a spatial weight matrix based on patterns of commuting flows with the aim of including the hierarchical as well as the contiguity effects. The consideration of spatial relations and the definition of hierarchical ordinations among spatial units could be of great interest from both statistical and economic perspectives.

Acknowledgments The authors are grateful to the editors B. H. Baltagi and G. Arbia and two anonymous referees for their helpful comments. This paper was presented at the International Workshop on Spatial Econometrics and Statistics, Rome, Italy, May 25–27, 2006. We are thankful to all the participants for their useful comments and suggestions.

References

Anselin L (1988) Spatial econometrics methods and models. Kluwer, Dordrecht
Anselin L, Florax RJGM, Rey SJ (2004) Econometrics for spatial models recent advances. In: Anselin L, Florax RJGM, Rey SJ (eds) Advances in spatial econometrics. Springer, Berlin, pp 1–25
Arcelus FJ (1984) An extension of shift-share analysis. Growth Change 15:3–8
Boarnet MG (1998) Spillovers and the locational effects of public infrastructure. J Reg Sci 38:381–400
Case AC, Rosen HS, Hines JR (1993) Budget spillovers and fiscal policy interdependence evidence from the states. J Public Econ 52:285–307
Cliff AD, Ord JK (1973) Spatial autocorrelation. Pion, London
Cliff AD, Ord JK (1981) Spatial processes models and applications. Pion, London
Dall’erba S (2005) Productivity convergence and spatial dependence among Spanish regions. J Geogr Syst 7:207–227
Dinc M, Haynes KE, Qiangsheng L (1998) A comparative evaluation of shift-share models and their extensions. Australas J Reg Stud 4:275–302
Dunn ES (1960) A statistical and analytical technique for regional analysis. Pap Reg Sci Assoc 6:97–112
Esteban-Marquillas JM (1972) Shift and Share analysis revisited. Reg Urban Econ 2:249–261
Fingleton B (2001) Equilibrium and Economic Growth Spatial Econometric Models and Simulations. J Reg Sci 41:117–147
Geary R (1954) The contiguity ratio and statistical mapping. Inc Stat 5:115–145
Getis A (1990) Screening for spatial dependence in regression analysis. Pap Reg Sci Assoc 69:69–81
Getis A (1995) Spatial filtering in a regression framework experiments on regional inequality government expenditures and urban crime. In: Anselin L, Florax R (eds) New directions in spatial econometrics. Springer, Berlin, pp 172–188
Getis A, Ord JK (1992) The analysis of spatial association by use of distance statistics. Geogr Anal 24:189–206
Getis A, Griffith DA (2002) Comparative spatial filtering analysis. Geogr Anal 34:130–140
Griffith DA (1996) Spatial autocorrelation and eigenfunctions of the geographic weights matrix accompanying geo-referenced data. Can Geogr 40:351–367
Griffith DA (2000) A linear regression solution to the spatial autocorrelation problem. J Geogr Syst 2:141–156
Haining R (1982) Describing and modelling rural settlement maps. Ann Assoc Am Geogr 72:211–223
Hewings GJD (1976) On the accuracy of alternative models for stepping-down multi-county employment projections to counties. Econ Geogr 52:206–217
Isard W (1960) Methods of regional analysis: an introduction to regional science. MIT, Cambridge
Longhi S, Nijkamp P (2005) Forecasting regional labour market developments under spatial heterogeneity and spatial autocorrelation. Paper prepared for the Kiel Workshop on Spatial Econometrics
López-Bazo E, del Barrio T, Artis M (2002) The regional distribution of Spanish unemployment: A spatial analysis. Pap Reg Sci 81:365–389
López-Bazo E, Vayá E, Artís M (2004) Regional externalities and growth evidence from European regions. J Reg Sci 44:43–73
López-Bazo E, Vayá E, Mora AJ, Suriñach J (1999) Regional economic dynamics and convergence in the European Union. Ann Reg Sci 33:343–370
Loveridge S, Selting AC (1998) A review and comparison of shift-share identities. Int Reg Sci Rev 21:37–58
Márquez AM, Hewings GJD (2003) Geographical competition between regional economies: The case of Spain. Ann Reg Sci 37:559–580
Márquez AM, Ramajo J, Hewings GJD (2006) Dynamic effects within a regional system: an empirical approach. Environ Plan A 38:771–732
Mayor M, López AJ (2005) The spatial shift-share analysis new developments and some findings for the Spanish case. In: Proceedings of the European Regional Science Association ERSA 2005, Amsterdam
Moran P (1948) The interpretation of statistical maps. J R Stat Soc B 10:243–251
Nazara S, Hewings GJD (2004) Spatial structure and taxonomy of decomposition in shift-share analysis. Growth Change 35:476–490
Paelinck JHP, Klaassen LH (1979) Spatial econometrics. Saxon House
Paelinck JHP, Mur J, Trívez J (2004) Spatial econometrics more lights than shadows. Estud Econ Aplic 22:1–19
Patuelli R, Griffith DA, Tiefelsdorf M, Nijkamp P (2006) The use of spatial filtering techniques: the spatial and space-time structure of German unemployment data. Tinbergen Institute Discussion Papers 06-049/3
Ripley B (1981) Spatial statistics. Wiley, New York
Tiefelsdorf M, Griffith DA (2007) Semi-parametric Filtering of Spatial Autocorrelation: The Eigenvector Approach. Environ Plann A 39:1193–1221

R&D spillovers and firms’ performance in Italy
Evidence from a flexible production function

Francesco Aiello · Paola Cardamone

Abstract Using a translog production function we estimate the impact of R&D spillovers on the output performance of Italian manufacturing firms over the period 1998–2003. Technological flows are measured through an asymmetric similarity index that also takes into account the geographical proximity of firms. Results show that R&D spillovers positively affect firms’ production and that geography matters in determining the role of the external technology. Moreover, we find that the effect of R&D spillovers is high in the Centre-South of Italy and that the stock of R&D spillovers is a Morishima complement to the stock of R&D own-capital.

Keywords R&D spillovers · Translog production function · Italian manufacturing firms

JEL Classification O33 · L29 · C23

The authors thank Giovanni Anania, Olof Ejermo, Vincenzo Scoppa, Alessandro Sterlacchini and Marco Vivarelli for useful comments on an earlier draft. We are also grateful to the participants at the Workshop on “Spatial Econometrics and Statistics” in Rome (University “Guido Carli”, May 2006) and at the 2006 ADRES Conference, “Networks of Innovation and Spatial Analysis of Knowledge Diffusion” in St Etienne for helpful discussion and to an anonymous referee for many detailed and constructive comments on an earlier version. All remaining errors and omissions are our own. Financial support received from MIUR is gratefully acknowledged.

F. Aiello (B) · P. Cardamone
Department of Economics and Statistics, University of Calabria, 87036 Rende, CS, Italy
e-mail: [email protected]
P. Cardamone
e-mail: [email protected]


1 Introduction

The effects of R&D spillovers have been strongly emphasized in recent theoretical and empirical studies. The literature surveys by Griliches (1991), Breschi and Lissoni (2001) and Wieser (2005) identify many papers that found R&D spillovers to have a positive impact on production. However, the empirical literature evaluating the impact of R&D spillovers at firm level is very scant. The studies1 using this level of data aggregation consider the stock of indirect R&D capital as an augmenting variable of a Cobb–Douglas production function and measure the external technology through other firms’ stock of R&D capital.

This paper departs from previous literature on two points. Firstly, we use a flexible production function in the belief that the issues relating to technological transfers across firms may be properly understood using this specification rather than the Cobb–Douglas form. Indeed, the Cobb–Douglas is somewhat restrictive in that it requires the elasticity of substitution between factors to be unity. By contrast, the translog is a generalization of the Cobb–Douglas, and relaxing this restriction makes it possible to understand whether external technology is complementary to or a substitute for private inputs (i.e., labour, physical, human and technological capital). Secondly, we pay special attention to the measurement of R&D spillovers. While, on the one hand, we follow the literature (see Griliches 1979) in determining the indirect stock of technological capital by the current and past investments in R&D made by other firms, on the other hand, we propose some refinements to model how, and to what extent, R&D capital spillovers flow across firms. We share with the literature the hypothesis that firms are not able to absorb all the external technology.
As a consequence, we calculate the R&D spillovers as the weighted sum of the indirect stock of R&D capital, but the method used to weight the innovation flows differs from that used by others in many respects. Indeed, our weighting system for the external stock of R&D capital is based on an index of similarity for each pair of firms. The hypothesis is that the more similar two firms are, the greater the flow of innovation between them (Jaffe 1986, 1988; Cincera 2005). As an index of similarity we use the uncentered correlation metric calculated by considering a set of firm-specific variables which defines the technological space. Furthermore, we introduce two improvements with regard to the micro-econometric applications of the index of similarity. As is known, the uncentered correlation metric yields a symmetric matrix of similarity. As the assumption of symmetry is restrictive, because direction matters in determining technological transfers from one firm to another, we propose the use of an asymmetric matrix of similarity. To this end, we transform the uncentered correlation metric by using the differences in human capital within each pair of firms.

A further original element is related to geographical proximity as a key determinant of the flows of innovation. In theory, it is widely agreed that spatial agglomeration is positively correlated with the diffusion of technology (Marshall 1920; Jacobs 1969; Romer 1986; Arrow 1962; Koo 2005; Audretsch and Feldman 2004). However, it is worth noting that, in Italy, the spatial dimension of the flow of knowledge has been disregarded by all the existing papers analysing the impact of R&D spillovers. Therefore, we attempt to fill this gap and test the hypothesis that the closer two firms are, the more they will benefit from each other’s R&D. This is done through a spatial weighting scheme based on the great circle distance.

1 Aiello and Pupo (2004), Aiello et al. (2005), Aiello and Cardamone (2005), Cincera (2005), Jaffe (1986), Jaffe (1988), Los and Verspagen (2000), Medda and Piga (2004), Raut (1995), Wakelin (2001).

The empirical analysis is conducted on a panel of 1,203 manufacturing firms for the period 1998–2003. Data are obtained from the eighth and ninth “Indagine sulle imprese manifatturiere” (Survey of Manufacturing Companies) by Capitalia (2002, 2005). In the econometric setting, we refer to a system of equations derived from the translog production function (Christensen et al. 1973) and control for the selection bias that zero values in R&D investments pose when the production function is log-linearized. This is done by modelling the decision to invest, or not, in R&D and then by estimating the production function with the 3SLS estimator.

The paper arrives at a number of interesting results on the impact of R&D spillovers on Italian firms’ performance. Our evidence reveals that the contribution of R&D spillovers to firms’ production is positive, whatever the weighting system of knowledge transmission and the sample of firms we analyse (full sample or sub-sample of firms according to their geographical localization), and higher than that estimated for the internal stock of R&D capital. Furthermore, it emerges that geographical proximity matters in determining the final result: our regressions which disregard geography in the process of technology diffusion underestimate the role of R&D spillovers. Finally, we find that the internal and external stocks of R&D capital are Morishima complements.

The remainder of the paper is organized as follows. Section 2 presents the production function specification and the estimation method adopted.
Section 3 describes the procedures used to determine the R&D spillovers. Section 4 illustrates the data used, while Sect. 5 reports the econometric results. Section 6 concludes.

2 Empirical setting

2.1 Production function specification

This section describes the production function used to estimate the impact of technological spillovers on firms’ production. Unlike the related literature, we consider a translog production function because of its higher flexibility with respect to the Cobb–Douglas specification. For the purpose of this paper, the transcendental logarithmic production function is expressed as follows:

\[
\begin{aligned}
\ln Y_{it} = {} & \alpha_i + \alpha_L \ln L_{it} + \alpha_K \ln K_{it} + \alpha_C \ln CT_{it} + \alpha_{Sp} \ln Spill_{it} \\
& + \tfrac{1}{2}\beta_{LL} (\ln L_{it})^2 + \tfrac{1}{2}\beta_{KK} (\ln K_{it})^2 + \tfrac{1}{2}\beta_{CC} (\ln CT_{it})^2 + \tfrac{1}{2}\beta_{SpSp} (\ln Spill_{it})^2 \\
& + \beta_{LK} \ln L_{it} \ln K_{it} + \beta_{LC} \ln L_{it} \ln CT_{it} + \beta_{LSp} \ln L_{it} \ln Spill_{it} \\
& + \beta_{KC} \ln K_{it} \ln CT_{it} + \beta_{KSp} \ln K_{it} \ln Spill_{it} + \beta_{CSp} \ln CT_{it} \ln Spill_{it} + \varepsilon_{it}
\end{aligned}
\tag{1}
\]

146

F. Aiello, P. Cardamone

where Y is output, measured by value added, L indicates the number of employees, K denotes physical capital, measured by the book value of total assets, CT is technological capital, determined by applying the perpetual inventory method to R&D investments (the depreciation rate is assumed to be 15%), and Spill is the stock of R&D spillovers every firm faces; subscript i refers to firms, t is time and ε is white noise. Following Berndt and Christensen (1973) and May and Denny (1979), we estimate Eq. (1) together with a set of cost-share equations. The resulting system of equations allows the use of additional information without increasing the number of parameters to be estimated (Antonioli et al. 2000). Furthermore, this procedure improves the efficiency of the estimates and limits the multicollinearity bias in Eq. (1) (Feser 2004; Lall et al. 2001; Goel 2002). The cost share equations are specified as follows. Denoting by $S_L$, $S_K$, $S_{CT}$ and $S_{Sp}$ the cost shares of labour, physical capital, R&D capital and R&D spillovers, respectively, under the assumption of constant returns to scale,2 we obtain:

$$
\begin{aligned}
S_{L,it} &= \alpha_L + \beta_{LL} \ln L_{it} + \beta_{LK} \ln K_{it} + \beta_{LC} \ln \mathit{CT}_{it} + \beta_{LSp} \ln \mathit{Spill}_{it} + u_{L,it} && (2) \\
S_{K,it} &= \alpha_K + \beta_{LK} \ln L_{it} + \beta_{KK} \ln K_{it} + \beta_{KC} \ln \mathit{CT}_{it} + \beta_{KSp} \ln \mathit{Spill}_{it} + u_{K,it} && (3) \\
S_{CT,it} &= \alpha_C + \beta_{LC} \ln L_{it} + \beta_{KC} \ln K_{it} + \beta_{CC} \ln \mathit{CT}_{it} + \beta_{CSp} \ln \mathit{Spill}_{it} + u_{CT,it} && (4) \\
S_{Sp,it} &= \alpha_{Sp} + \beta_{LSp} \ln L_{it} + \beta_{KSp} \ln K_{it} + \beta_{CSp} \ln \mathit{CT}_{it} + \beta_{SpSp} \ln \mathit{Spill}_{it} + u_{Sp,it} && (5)
\end{aligned}
$$
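The right-hand sides of Eqs. (2)–(5), net of the error terms, coincide with the derivatives of Eq. (1) with respect to the log of each input, that is, with the output elasticities discussed later in the paper. A minimal sketch of that calculation follows. The function name, the dictionary layout and any coefficient values passed to it are illustrative assumptions, not the authors' code or estimates.

```python
import math

# Input labels follow the notation of Eq. (1).
INPUTS = ("L", "K", "CT", "Spill")


def output_elasticities(alpha, beta, levels):
    """Return d ln Y / d ln x for each input x of a translog function.

    alpha  : dict mapping input -> first-order coefficient
    beta   : dict mapping (input, input) -> second-order coefficient,
             one entry per unordered pair (beta is symmetric)
    levels : dict mapping input -> strictly positive input level
    """
    logs = {name: math.log(levels[name]) for name in INPUTS}

    def b(i, j):
        # Accept either key order, since beta_ij = beta_ji.
        return beta[(i, j)] if (i, j) in beta else beta[(j, i)]

    # Elasticity of input i: alpha_i + sum_j beta_ij * ln x_j
    return {i: alpha[i] + sum(b(i, j) * logs[j] for j in INPUTS) for i in INPUTS}
```

Because the elasticities depend on the input levels, they have to be evaluated at a chosen point, for example at the sample means of the logged inputs.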

Since the sum of input cost shares must be equal to one, all the parameters have been estimated by considering the system composed of Eqs. (1), (2), (3) and (4). This choice allows us to avoid the problems one would encounter in estimating Eq. (5), which arise because the costs of R&D spillovers are not observable (cf. note 9).3

2 Constant returns to scale imply that $\sum_i \alpha_i = 1$ and $\sum_j \beta_{ij} = 0$.

3 The labour cost share $S_L$ is the ratio of total labour cost to value added. Following Verspagen (1995), we compute $S_K$ and $S_{CT}$ as $[P_I(\delta + r)]Z/Y$, where $P_I$ is the investment price deflator, δ is the rate of depreciation, assumed to be equal to 5% for physical capital and 15% for technological capital, r is the interest rate, which is assumed to be 5%, Z is the stock of capital (physical or technological) and Y is the firm’s output, measured as value added.

2.2 Estimation method

There are two major econometric issues when estimating a production function like Eq. (1). The first is the endogeneity of the regressors, whereas the second is the sample selection problem due to the log-linearization of the production function. We address these issues as follows. The system of Eqs. (1)–(4) is estimated through the 3SLS estimator. We use as instruments the 1-year lagged values of all the regressors of Eq. (1), except for $\ln \mathit{Spill}$ and $(\ln \mathit{Spill})^2$, which are assumed to be exogenously determined at firm level. The Hausman test supports this choice. Furthermore, the log-linearization of a production function creates a sample selection problem, arising from the fact that the stock of R&D capital is constructed using R&D investments and that, in many cases, firms do not invest in R&D (zero investment values). Hence, we have a sub-sample of R&D performing firms (with positive values for R&D capital) and a sub-sample of non-R&D performing firms (with zero values for R&D capital), but the log-linearization of Eq. (1) restricts the sample to the R&D performing firms. In so doing, the log-linearization makes the sample no longer random, because it rules out the underlying process that leads each firm to invest, or not, in R&D. Consequently, there might be a selection problem due to the likely correlation between the decision to invest in R&D and the production function. As the selection occurs for the stock of R&D capital, which is a regressor, we address this issue using a treatment effect model and implementing a two-step instrumental variable procedure (Wooldridge 2002): in the first step we consider a probit model to explain the decision to invest in R&D, whereas, in the second step, we estimate the translog production function using as an instrument the fitted probabilities derived from the first step. The dependent variable of the probit model is unity if the ith firm invests in R&D and zero if it does not. The regressors are the explanatory variables of the production function (Eq. 1), plus the key determinants of innovative efforts that we select following the related literature (Leo 2003; Becker and Pain 2003; Gustavsson and Poldhal 2003; Bhattacharya and Bloch 2004).4 From the probit model, we get the fitted probabilities ($\hat{G}_{it}$) that enter Eq. (1) as an instrument. This procedure allows us to estimate Eq. (1) for the R&D performing firms only and is suitable for two main reasons. 
First of all, the usual standard errors and test statistics are asymptotically valid and, secondly, no particular specification of the probit model has to be set up, because the variable $\hat{G}_{it}$ is used as an instrument (Wooldridge 2002).5

4 The variables we consider as determinants of the decision to invest in R&D are the following: human capital, cash flow, investments in ICT, a dummy equal to unity if firm i exports and a set of dummies referring to the geographical location and the economic sector of each firm. The cash flow variable is computed as gross profits minus taxes plus depreciation. The ICT variable is the sum of hardware, software and telecommunication investments. Human capital is expressed by $\exp(\varphi_R\, Sh)$, where Sh is the average number of years of schooling (8 for primary and middle school, 13 for high school and 18 for bachelor degree) and $\varphi_R$ is the regional rate of return on education drawn from Ciccone (2004).

5 Indicating with w the treatment indicator, which is equal to 1 if there is treatment and 0 otherwise, and with $G(x, z, \gamma^*)$ the probit specification, “what we need is that the linear projection of w onto $[x, G(x, z, \gamma^*)]$ actually depends on $G(x, z, \gamma^*)$, where we use $\gamma^*$ to denote the plim of the maximum likelihood estimator when the model is mis-specified [. . .] These requirements are fairly weak when z is partially correlated with w” (Wooldridge, 2002, p. 624).

3 The stock of R&D spillovers

If the external technology available to a firm is related to the R&D investments of other firms (Griliches 1979), then a first measure of R&D spillovers will be the unweighted sum of the R&D stocks of the other n − 1 entities. This measure has two drawbacks: it assumes that (a) all the external technology is relevant for a firm and (b) the capacity to absorb technology does not differ across firms and sectors. In line with the argument that not all the investment efforts made by others are relevant for a firm (Griliches 1991), many papers agree that the measure of technological spillovers must be a weighted sum of the R&D capital stocks of other firms. However, there is no consensus on the weighting system to be used. The most commonly used methods are based on either patent data (Jaffe 1986, 1988; Los and Verspagen 2000; Cincera 2005) or input-output matrices (Wakelin 2001; Medda and Piga 2004; Aiello and Pupo 2004; Aiello et al. 2005; Aiello and Cardamone 2005). Because both the I/O and the patent data are subject to criticism,6 we propose the use of a weighting system of external R&D capital based on firms’ technological similarity and/or geographical proximity.
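The contrast between the unweighted and the weighted sum of other firms' R&D stocks can be sketched as follows. This is a minimal illustration under stated assumptions: firms are indexed from 0, the weights are supplied as a nested list whose entry `weights[i][j]` is the weight firm i attaches to firm j, and the function name `spillover_stocks` is hypothetical rather than taken from the paper.

```python
def spillover_stocks(rd_stock, weights=None):
    """Sum the R&D stocks of all other firms, optionally weighted.

    rd_stock : list of R&D capital stocks, one per firm
    weights  : optional nested list; weights[i][j] is the weight that
               firm i attaches to the R&D stock of firm j.
               If None, every other firm gets weight 1 (unweighted sum).
    """
    n = len(rd_stock)
    stocks = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                # A firm's own R&D stock is not part of its spillover pool.
                continue
            w = 1.0 if weights is None else weights[i][j]
            total += w * rd_stock[j]
        stocks.append(total)
    return stocks
```

With an asymmetric weight matrix, `weights[i][j]` and `weights[j][i]` may differ, so two firms can draw different amounts from each other's R&D.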

6 The use of patents to measure the flows of knowledge runs into the same problem encountered when patents are used as an indicator of firms’ innovative activities, that is “not all inventions are patentable, not all inventions are patented” (Griliches, 1990, p. 1669). The I/O approach does not properly gauge the real magnitude of pure knowledge spillovers, as it is related to the flows of goods and services rather than to purely technological flows. Moreover, both I/O and patent matrices are generally available at industry level only and, consequently, their use at firm level (see, e.g., Los and Verspagen 2000; Aiello and Pupo 2004; Aiello et al. 2005; Medda and Piga 2004) requires that (a) the absorption of technology is constant across the firms in a sector and (b) the extra-industry technology is the same for all firms belonging to the same industry.

3.1 Firms’ technological similarity

Understanding a firm’s position in a technological space helps to determine its technological opportunities, that is, the amount of technological resources available to each entity (Cohen and Levinthal 1990). Furthermore, technological opportunities affect the absorptive capacity to use new technology and to be innovative (Cohen and Levinthal 1989). Many authors (Jaffe 1986, 1988; Griliches 1979, 1991; Cincera 2005; Harhoff 2000; Inkmann and Pohlmeier 1995; Kaiser 2002) agree that absorptive capacity depends on technological proximity: the closer two firms are in the technological space, the more they benefit from each other’s research efforts. From an empirical perspective, two questions must be settled before measuring the similarity between firms: the choice of the variables which define the technological space, and the index of similarity to be used. Several authors (Jaffe 1986, 1988; Los and Verspagen 2000; Cincera 2005) argue that patent data allow the proper definition of an innovative space, others use investments in R&D (Harhoff 2000; Adams and Jaffe 1996), while Inkmann and Pohlmeier (1995) consider a set of firm specific characteristics (size, demand expectations, industry affiliation). As for the calculation of firms’ similarity, many studies (Jaffe 1986, 1988; Cincera 2005; Kaiser 2002; Los and Verspagen 2000) propose the uncentered correlation metric because, unlike the Euclidean distance, it is not sensitive to the length of the vector which comprises the variables defining the technological space. We proceed by using the uncentered correlation metric, which is expressed as follows:

$$
\omega_{ijt} = \frac{X_{it} X_{jt}'}{\left[(X_{it} X_{it}')(X_{jt} X_{jt}')\right]^{1/2}} \tag{6}
$$

where X is a set of variables defining the technological space of firms and t is time (1998–2003).7 The index $\omega_{ijt}$ ranges from zero to one. It is zero when firm i and firm j are not related at all, while it is unity if the k variables in $X_{it}$ and $X_{jt}$ are identical.8 Equation (6) yields a symmetric matrix of weights. This means that at time t the intensity of the technological flows from firm i to firm j is equal to that observed from firm j to firm i. This property of the index $\omega_{ijt}$ contrasts with the evidence that direction matters in determining how technology circulates from one firm to another. Therefore, we attempt to overcome the unrealistic assumption of symmetry by using the following transformation:

$$
\hat{\omega}_{ijt} = \frac{X_{it} X_{jt}'}{\left[(X_{it} X_{it}')(X_{jt} X_{jt}')\right]^{1/2}} \cdot \frac{h_{it}}{\max(h_{it}, h_{jt})}, \qquad
\hat{\omega}_{jit} = \frac{X_{it} X_{jt}'}{\left[(X_{it} X_{it}')(X_{jt} X_{jt}')\right]^{1/2}} \cdot \frac{h_{jt}}{\max(h_{it}, h_{jt})} \tag{7}
$$

where the variable h is a measure of human capital, as defined in note 4. The idea of making the uncentered correlation asymmetric ($\hat{\omega}_{ijt} \neq \hat{\omega}_{jit}$) by using a measure of human capital relies on the fact that the firm’s absorptive capacity is strongly dependent on the quality of its human capital [see, among many others, Lucas (1988), Vinding (2006) and Wang (2007)].

7 In the prevailing literature, the index of similarity is determined using only one variable (the investments in R&D or the sectoral patent data). This choice appears to be a strong constraint, because it is a very partial way of gauging the firm’s capacity to absorb external technology (two firms may be similar in terms of R&D investments, but their “absorptive capacity” may be limited because of other factors, one of which might be, e.g., the availability of human capital). Moreover, our index of similarity differs at firm-pair level and this allows us to overcome the strict assumption that the firms operating in a given sector have the same absorptive capacity.

8 The variables used to construct the index of similarity are the investments in ICT, the internal and external R&D investments, the ratio between the number of skilled workers and the number of unskilled workers (the skilled workers are those with at least a high school diploma) and the Herfindahl–Hirschman index, as a measure of market concentration in the sector to which the firm belongs.

3.2 Firms’ geographical proximity

It is commonly agreed that the flows of innovation depend not only on the technological but also on the geographical distance between firms: face-to-face contacts enhance knowledge spillovers whatever the technological similarity. Although a huge number of papers deal with the theoretical issues of the nexus between spatial agglomeration and knowledge spillovers (Marshall 1920; Jacobs 1969; Romer 1986; Arrow 1962; Koo 2005; Audretsch and Feldman 2004), the empirical analyses estimating the extent to which geography affects the diffusion of technology at firm level are very limited. The few exceptions are the studies by Adams and Jaffe (1996), Orlando (2000) and Lu et al. (2005). To the best of our knowledge no paper focuses on Italian manufacturing firms. A way to weight the diffusion of innovation among firms located in different areas is to take into account the geographical distance between them. To this end we use the great circle formula, which yields the shortest distance between two points on a sphere. For each pair of firms i and j, the distance between them ($d_{ij}$) is calculated as the distance between the administrative capitals of the provinces where the firms operate. Given the distance ($d_{ij}$) between a pair of firms, an index of geographical proximity is:

$$
g_{ij} = 1 - \frac{d_{ij}}{\max(d_{ij})} \tag{8}
$$

which is unity when $d_{ij} = 0$, namely when firms i and j are in the same province, and is zero when $d_{ij} = \max(d_{ij})$, that is when the distance between i and j equals the maximum distance, which, in Italy, is given by the distance from Aosta to Siracusa. Finally, we attempt to merge the basic ideas behind Eqs. (7) and (8). In Eq. (7), we simply state that technological similarity is the only factor that explains the flow of innovation, while Eq. (8) attributes this uniqueness to geographical proximity. Since it is likely that the closer and more similar firms are, the more they benefit from each other’s technology, we average the indexes $\hat{\omega}_{ijt}$ and $g_{ij}$:

$$
v_{ijt} = \frac{\hat{\omega}_{ijt} + g_{ij}}{2} \tag{9}
$$
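The weighting schemes of Eqs. (6)–(9) can be sketched in a few lines. The sketch rests on assumptions that the text does not state: the great circle distance is computed with the haversine formula, the Earth radius is taken to be 6,371 km, coordinates are in decimal degrees, and all function names are hypothetical.

```python
import math


def uncentered_correlation(x_i, x_j):
    """Eq. (6): uncentered correlation between two characteristic vectors."""
    dot = sum(a * b for a, b in zip(x_i, x_j))
    norm = math.sqrt(sum(a * a for a in x_i) * sum(b * b for b in x_j))
    return dot / norm


def asymmetric_similarity(x_i, x_j, h_i, h_j):
    """Eq. (7): weight for the pair (i, j), scaled by firm i's human capital."""
    return uncentered_correlation(x_i, x_j) * h_i / max(h_i, h_j)


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great circle distance via the haversine formula (an assumption here)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def geographic_proximity(d_ij, d_max):
    """Eq. (8): 1 for firms in the same province, 0 at the maximum distance."""
    return 1.0 - d_ij / d_max


def combined_weight(omega_hat_ij, g_ij):
    """Eq. (9): simple average of the asymmetric similarity and proximity."""
    return 0.5 * (omega_hat_ij + g_ij)
```

Swapping the roles of i and j in `asymmetric_similarity` changes only the human capital ratio, which is what makes the resulting weight matrix asymmetric.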

The index $v_{ijt}$ is asymmetric and ranges from zero to one. It is zero when firm i and firm j are both technologically and geographically “dissimilar”, while it is unity if the closeness of the pair (i, j) is unity in both dimensions (technology and geography). It is worth emphasising that the assumption behind Eq. (9) is that the weights of the two indexes are equal.9

9 The assumption made in Eq. (9) has been imposed by the fact that we have no prior information concerning the relative importance of technological similarity with respect to geographical proximity in the diffusion of knowledge. Hence the results we obtain using Eq. (9) must be used with caution and considered only a first step in a promising line of research. A natural extension of our study could be the estimation of the translog production function by including as regressors two distinct measures of R&D spillovers (the ones obtained using the technological similarity and the geographical distance) instead of the one that combines the weights. Although this is a fashionable idea, it cannot be implemented within our framework of analysis because we use a system of equations and, thus, we have the constraint to identify the cost share equations (Eqs. 2–5). In other words, if we used two measures of R&D spillovers, then we would have to include, in the system of equations, the cost share equations of R&D spillovers. This is a hard task, because the costs of R&D spillovers are not observable.

All the weighting systems have been used to determine the stock of R&D spillovers [$\mathit{Spill}_{it}$ in Eq. (1)]. For the ith firm, the variable $\mathit{Spill}_{it}$ is given by:

$$
\mathit{Spill}_{it} = \sum_{\substack{j=1 \\ j \neq i}}^{N} \upsilon_{ijt}\, \mathit{CT}_{jt}, \qquad i = 1, 2, \ldots, N \tag{10}
$$

where $\upsilon_{ijt}$ indicates a generic weighting system as expressed in Eqs. (6)–(9) and $\mathit{CT}_{jt}$ is the R&D capital stock of the jth firm. We construct four stocks of R&D spillovers. Firstly, we refer to the stock of R&D spillovers obtained with the symmetric similarity approach ($\upsilon_{ijt} = \omega_{ijt}$). Secondly, we compute the R&D spillovers using the asymmetric similarity index ($\upsilon_{ijt} = \hat{\omega}_{ijt}$). Thirdly, we calculate the flows of innovation through the index of geographical proximity ($\upsilon_{ijt} = g_{ij}$). Finally, we refer to the average of the technological similarity and geographical proximity indexes ($\upsilon_{ijt} = v_{ijt}$).

4 Data

In the empirical analysis, we use a firm-level dataset extracted from the eighth and ninth “Indagine sulle imprese manifatturiere” (“Survey of manufacturing companies”) (IMM) surveys made by Capitalia (2002, 2005). These two surveys cover the period 1998–2003, contain standard balance sheets and collect a great deal of qualitative information from a large sample of Italian firms. Each survey considers more than 4,500 firms, including all Italian manufacturing firms with more than 500 workers and a representative sub-sample of firms with more than 10 workers (the stratification used by Capitalia considers location, size and sector of the firm). After data cleaning we obtain a panel of 7,218 observations, with large N (1,203 cross sections) and small T (6 years). We split the sample into R&D and non-R&D performing firms. The first group is composed of firms with positive R&D investments. Figure 1 shows the share of R&D performers over the period 1998–2003. In 1998, there were 385 R&D performers, that is 32% of the entire sample of firms we consider, whereas this proportion was 36% (430 firms) in 2003. Table 1 shows a breakdown of the sample of firms in 2003.10 Regarding the geographical location, in 2003 about two-thirds of firms were located in northern Italy (445 in the North West and 382 in the North East). At the industry level, the full sample is dominated by firms in the base metals, non-electrical machinery and textiles industries, while the petroleum refinery industry is represented by 6 firms only. In the case of R&D performers, most firms are located in northern Italy (177 in the North West and 153 in the North East, out of 430 entities) and the industries with the highest number of companies are the non-electrical (102 firms) and the electrical machinery (61 firms) sectors. Finally, the distribution of firms by size is strictly in line with that of the entire Italian industrial system, which is characterized by the massive presence of small and medium sized firms (Table 1).

Fig. 1 Proportion of R&D performing firms in the full sample for each year from 1998 to 2003 (share by year: 1998, 0.32; 1999, 0.32; 2000, 0.33; 2001, 0.34; 2002, 0.35; 2003, 0.36)

10 We limit the description of the sample to 2003 because the distributions of Italian firms by size, sector and area over the period 1998–2002 are similar to those of 2003 (data are available upon request).

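Table 1 Italian manufacturing firms by area, industry and size in 2003

| Sector / Area | Total sample: 11–20 E | Total sample: 21–50 E | Total sample: 51–250 E | Total sample: >250 E | Total sample: Total | R&D performing firms: 11–20 E | R&D performing firms: 21–50 E | R&D performing firms: 51–250 E | R&D performing firms: >250 E | R&D performing firms: Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Sector | | | | | | | | | | |
| Food, beverages and tobacco | 45 | 42 | 13 | 3 | 103 | 8 | 11 | 4 | 2 | 25 |
| Textiles and apparel | 54 | 51 | 30 | 13 | 148 | 17 | 15 | 12 | 7 | 51 |
| Leather | 20 | 16 | 14 | 0 | 50 | 3 | 3 | 9 | 0 | 15 |
| Wood products and furniture | 13 | 23 | 10 | 1 | 47 | 1 | 3 | 5 | 1 | 10 |
| Paper, paper prod. and printing | 35 | 20 | 10 | 3 | 68 | 4 | 5 | 1 | 0 | 10 |
| Petroleum refineries and product | 4 | 2 | 0 | 0 | 6 | 0 | 1 | 0 | 0 | 1 |
| Chemicals | 27 | 15 | 10 | 3 | 55 | 12 | 6 | 8 | 3 | 29 |
| Rubber and plastic products | 21 | 27 | 13 | 4 | 65 | 4 | 10 | 9 | 4 | 27 |
| Non-metallic mineral products | 33 | 32 | 11 | 5 | 81 | 9 | 3 | 6 | 1 | 19 |
| Basic metal and fab. met. prod. | 84 | 72 | 31 | 6 | 193 | 11 | 16 | 10 | 5 | 42 |
| Non-electrical machinery | 43 | 60 | 56 | 15 | 174 | 12 | 34 | 43 | 13 | 102 |
| Electrical machinery and electronics | 27 | 42 | 21 | 10 | 100 | 11 | 22 | 19 | 9 | 61 |
| Motor vehicles and other transport equipment | 9 | 8 | 5 | 5 | 27 | 2 | 2 | 2 | 4 | 10 |
| Other manufacturing industries | 37 | 30 | 18 | 1 | 86 | 9 | 8 | 10 | 1 | 28 |
| Area | | | | | | | | | | |
| North West | 178 | 136 | 98 | 33 | 445 | 44 | 46 | 62 | 25 | 177 |
| North East | 142 | 139 | 79 | 22 | 382 | 36 | 47 | 51 | 19 | 153 |
| Centre | 89 | 91 | 37 | 10 | 227 | 15 | 28 | 17 | 4 | 64 |
| South | 43 | 74 | 28 | 4 | 149 | 8 | 18 | 8 | 2 | 36 |
| Total | 452 | 440 | 242 | 69 | 1,203 | 103 | 139 | 138 | 50 | 430 |

Source: Our calculation from data by Capitalia (2005). E = Employees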

Table 2 reports the labour productivity and the physical and technological capital intensities. Labour productivity is expressed as value added to employees, whereas both capital intensities are computed with respect to value added. Data are expressed at 2000 real prices and refer to 2003.11 Results reveal that labour productivity is 67,000 euros for the entire sample of firms, which is a higher value than that (63,000 euros) observed for R&D performers. This discrepancy depends on the productivity in the Centre (90,000 euros) and in the South (68,000 euros) of Italy. Moreover, these figures are driven by the high level of productivity of one firm with 21–50 workers operating in the petroleum industry and by two firms with more than 250 workers belonging to the paper sector. If we exclude these firms the differences in labour productivity decrease. The comparison of results obtained when classifying firms by size and sectors indicates that in many cases the productivity of small and medium sized R&D performers is higher than the average production of the entire sample.12 Such evidence seems to suggest that the small and medium sized R&D performers obtain a high level of productivity because they compensate for the diseconomies of scale by exploiting the advantages of being innovative. As for physical capital intensity, we find that it is 1.31 for the total sample of firms and 1.27 for the R&D performers. What clearly emerges is that firms located in the South of Italy of a size of up to 250 workers have a physical capital intensity which is much higher than that observed for firms in the other areas. We find a confirmation of the overcapitalization of southern firms which the literature ascribes to the Italian policies addressed at helping the poorest areas of the country by making grants aimed at factor accumulation. Bearing in mind the specific aim of this paper, the analysis of R&D capital intensity is of great interest. 
At a national level, it is 0.33 for all the R&D performing firms; moreover, firms operating in the North West of Italy register a value (0.42) which is higher than the national average, while in the other areas the R&D intensity is low (0.36 in the North East, 0.22 in the Centre and 0.05 in the South). The R&D intensity differs strongly when one considers firm size: it is 0.38 in the case of the firms with more than 250 employees, 0.33 for small firms (11–50 workers) and 0.2 for medium ones (51–250 employees). Finally, intensity is high in the chemical (0.82), electrical (0.54) and non-electrical (0.37) sectors and low in the wood (0.04) and paper (0.05) industries. These findings show that innovative activities are concentrated in the northern regions of the country, whose local industrial system is dominated by the presence of large and very innovative firms. They are further evidence supporting the map of innovative activity in Italy (see, among many others, Breschi and Lissoni 2001).13

11 Weights are given by $f_{it} = F_{it} \big/ \sum_{t=1998}^{2003} \sum_{i=1}^{N} F_{it}$, where $F_{it}$ is the sales of the ith firm at time t (t = 1998, . . . , 2003) belonging to a group of size N (i = 1, . . . , N).

12 This holds, for example, in the case of the small (11–20 workers) and/or medium (21–50 workers) firms active in the food, textiles, paper, rubber, electrical, non-metallic, non-electrical and motor vehicles sectors.

13 The high level of R&D capital intensity in the same regions of the country and in specific sectors finds an explanation in the high concentration of R&D investments. Indeed, 5% of the sample, that is 28 firms, accounted for 71% of total R&D investments in 2003. This 5% of firms invests, on average, more than five million euro per year and is geographically very concentrated (25 out of 28 firms are located in northern Italy). Again, 20% of the sample is composed of 111 firms and accounts for 89% of R&D investments (for ease of exposition, data are available on request only).

Table 2 Labour productivity and factor intensity in Italian manufacturing firms by industry, area and size in 2003 (weighted average)

Total sample

| Sector / Area | Y/L: 11–20 E | Y/L: 21–50 E | Y/L: 51–250 E | Y/L: >250 E | Y/L: Total | K/Y: 11–20 E | K/Y: 21–50 E | K/Y: 51–250 E | K/Y: >250 E | K/Y: Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Sector | | | | | | | | | | |
| Food, beverages and tobacco | 43 | 54 | 43 | 61 | 53 | 3.37 | 2.50 | 1.19 | 4.45 | 3.12 |
| Textiles and apparel | 52 | 49 | 41 | 61 | 55 | 0.69 | 0.97 | 1.33 | 1.28 | 1.19 |
| Leather | 40 | 38 | 43 | − | 41 | 0.68 | 0.58 | 1.36 | − | 1.09 |
| Wood products and furniture | 34 | 43 | 37 | 72 | 44 | 0.86 | 1.76 | 1.26 | 1.46 | 1.43 |
| Paper, paper prod. and printing | 40 | 56 | 50 | 183 | 147 | 0.90 | 0.89 | 1.25 | 0.73 | 0.82 |
| Petroleum refineries and product | 74 | 260 | − | − | 229 | 1.32 | 1.92 | − | − | 1.82 |
| Chemicals | 69 | 78 | 67 | 74 | 73 | 1.09 | 1.40 | 1.30 | 0.59 | 0.83 |
| Rubber and plastic products | 49 | 45 | 60 | 81 | 67 | 0.95 | 1.19 | 2.79 | 1.62 | 1.79 |
| Non-metallic mineral products | 60 | 53 | 51 | 85 | 72 | 1.67 | 1.99 | 2.10 | 1.56 | 1.72 |
| Basic metal and fab. met. prod. | 63 | 46 | 73 | 60 | 62 | 1.23 | 1.34 | 1.21 | 1.74 | 1.36 |
| Non-electrical machinery | 49 | 56 | 64 | 65 | 63 | 0.57 | 0.55 | 0.76 | 0.90 | 0.80 |
| Electrical machinery and electronics | 40 | 44 | 59 | 51 | 51 | 0.63 | 0.56 | 0.67 | 1.02 | 0.85 |
| Motor vehicles and other transport equipment | 39 | 42 | 35 | 58 | 55 | 0.71 | 0.87 | 0.99 | 1.56 | 1.48 |
| Other manufacturing industries | 47 | 36 | 40 | 38 | 40 | 0.79 | 0.85 | 1.05 | 0.78 | 0.91 |
| Area | | | | | | | | | | |
| North West | 54 | 50 | 57 | 66 | 61 | 0.95 | 1.31 | 1.07 | 1.18 | 1.14 |
| North East | 48 | 53 | 54 | 72 | 62 | 1.02 | 1.08 | 0.92 | 1.82 | 1.40 |
| Centre | 58 | 50 | 60 | 120 | 90 | 1.03 | 0.87 | 1.30 | 1.05 | 1.06 |
| South | 40 | 88 | 50 | 69 | 68 | 3.28 | 1.96 | 2.82 | 1.47 | 2.12 |
| Total | 52 | 58 | 55 | 79 | 67 | 1.27 | 1.27 | 1.18 | 1.38 | 1.31 |

Table 2 continued

R&D performing firms

| Sector / Area | Y/L: 11–20 E | Y/L: 21–50 E | Y/L: 51–250 E | Y/L: >250 E | Y/L: Total | K/Y: 11–20 E | K/Y: 21–50 E | K/Y: 51–250 E | K/Y: >250 E | K/Y: Total | CT/Y: 11–20 E | CT/Y: 21–50 E | CT/Y: 51–250 E | CT/Y: >250 E | CT/Y: Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sector | | | | | | | | | | | | | | | |
| Food, beverages and tobacco | 49 | 62 | 49 | 57 | 56 | 1.60 | 2.23 | 1.11 | 4.68 | 3.60 | 0.16 | 0.07 | 0.12 | 0.10 | 0.10 |
| Textiles and apparel | 49 | 52 | 42 | 68 | 62 | 0.82 | 1.00 | 1.22 | 1.10 | 1.09 | 0.40 | 0.27 | 0.27 | 0.07 | 0.14 |
| Leather | 45 | 38 | 47 | − | 46 | 0.62 | 0.59 | 1.51 | 1.01 | 1.40 | 0.28 | 0.23 | 0.07 | 0.01 | 0.09 |
| Wood products and furniture | 32 | 48 | 40 | 79 | 51 | 0.89 | 1.22 | 1.38 | − | 1.24 | 0.11 | 0.07 | 0.03 | − | 0.04 |
| Paper, paper prod. and printing | 51 | 57 | 60 | 195 | 133 | 0.69 | 0.72 | 1.64 | 0.51 | 0.82 | 0.14 | 0.07 | 0.09 | 0.01 | 0.05 |
| Petroleum refineries and product | 53 | 98 | − | − | 85 | 1.17 | 0.51 | − | − | 0.69 | 0.04 | 0.18 | − | − | 0.14 |
| Chemicals | 61 | 110 | 68 | 74 | 74 | 1.12 | 1.01 | 1.37 | 0.59 | 0.75 | 0.66 | 0.81 | 0.31 | 0.94 | 0.82 |
| Rubber and plastic products | 70 | 54 | 60 | 81 | 73 | 1.10 | 1.37 | 1.74 | 1.62 | 1.61 | 0.39 | 0.17 | 0.33 | 0.41 | 0.37 |
| Non-metallic mineral products | 41 | 60 | 52 | 78 | 71 | 1.13 | 1.75 | 1.48 | 1.38 | 1.41 | 0.14 | 0.19 | 0.14 | 0.08 | 0.10 |
| Basic metal and fab. met. prod. | 48 | 42 | 47 | 62 | 55 | 1.10 | 1.14 | 1.51 | 1.32 | 1.34 | 0.17 | 0.09 | 0.17 | 0.52 | 0.35 |
| Non-electrical machinery | 58 | 59 | 66 | 63 | 64 | 0.50 | 0.55 | 0.77 | 0.88 | 0.81 | 0.34 | 0.32 | 0.22 | 0.48 | 0.37 |
| Electrical machinery and electronics | 46 | 47 | 58 | 49 | 51 | 0.77 | 0.47 | 0.67 | 1.03 | 0.87 | 0.24 | 0.27 | 0.43 | 0.65 | 0.54 |
| Motor vehicles and other transport equipment | 46 | 35 | 36 | 61 | 59 | 1.13 | 2.27 | 0.95 | 1.30 | 1.29 | 0.09 | 0.06 | 0.26 | 0.23 | 0.23 |
| Other manufacturing industries | 81 | 39 | 41 | 38 | 43 | 0.48 | 0.95 | 1.16 | 0.78 | 0.97 | 0.15 | 0.17 | 0.16 | 0.28 | 0.19 |
| Area | | | | | | | | | | | | | | | |
| North West | 52 | 53 | 60 | 65 | 62 | 0.90 | 1.13 | 1.06 | 1.09 | 1.08 | 0.27 | 0.40 | 0.28 | 0.49 | 0.42 |
| North East | 56 | 63 | 48 | 68 | 63 | 0.82 | 0.53 | 0.96 | 1.86 | 1.52 | 0.27 | 0.23 | 0.17 | 0.36 | 0.30 |
| Centre | 58 | 55 | 69 | 65 | 64 | 0.82 | 0.98 | 1.38 | 1.11 | 1.15 | 0.54 | 0.18 | 0.18 | 0.22 | 0.22 |
| South | 39 | 43 | 53 | 75 | 66 | 1.47 | 2.49 | 1.61 | 1.02 | 1.32 | 0.21 | 0.07 | 0.29 | 0.05 | 0.10 |
| Total | 53 | 56 | 56 | 67 | 63 | 0.91 | 1.00 | 1.09 | 1.40 | 1.27 | 0.33 | 0.25 | 0.23 | 0.38 | 0.33 |

Source: See Table 1. Weights are expressed as the sales of the ith firm in relation to the aggregate sales of the group. Y/L = value added per employee (in thousands of euro), K/Y = physical capital/value added, CT/Y = technological capital/value added, E = Employees

5 Results 5.1 Output elasticities Although the individual coefficients of the regressors in the translog production function are not directly interpretable, the 3SLS estimates merit a very brief comment. From data reported in Appendix A, it emerges that the majority of the interactive terms and the squared variables of the translog is significant. This means that if we used a Cobb– Douglas production function, we would introduce a bias in the estimations due to the omission of relevant variables. We focus the discussion on the output elasticity retrieved from the estimates of the translog production. Data reported in column 1 of Table 3 show the outcomes that we obtain using the symmetric index of technological similarity, while, in columns 2, 3 and 4, we present the findings associated to the other weighting systems (the asymmetric index of technological similarity, the index of geographical proximity and their average, respectively). We expect that the method of weighting the flow of innovation matters in determining the impact of R&D spillovers on firms’ output. The first finding is that all the output elasticities are positive and highly significant. With regards labour, we find that the elasticity varies around 0.39, except when we use the asymmetric index of similarity. In this case, it is 0.49. The same behaviour is found for elasticity with respect to physical capital (0.23 in column 2) and R&D capital (0.14). Finally, the magnitude of the impact of R&D spillovers on the level of firm production is high: the elasticity is roughly 0.3, but it is as low as 0.14 when the flow of innovation is weighted through the pure asymmetric index of technological similarity

Table 3 Output elasticity in Italian R&D performing firms Output elasticity

L K CT Spill

Italy Symmetric techn. Spill. υi j = ωi j

Asymmetric techn. Spill. υi j = ω˜ i j

Geograph. Spill υi j = gi j

Asymm. techn. and geogr. Spill. υi j = νi j 0.3763***

0.4175***

0.4873***

0.3718***

(0.004)

(0.00347)

(0.00471)

(0.00462)

0.1986***

0.2324***

0.1693***

0.1716***

(0.00315)

(0.00272)

(0.00366)

(0.00358) 0.1045***

0.1120***

0.1438***

0.1063***

(0.00205)

(0.00177)

(0.00251)

(0.0024)

0.2718***

0.1364***

0.3526***

0.3476***

(0.00805)

(0.00639)

(0.0099)

(0.00964)

3SLS estimations (1998–2003) Note: Standard errors reported in brackets *** Statistical significance at 1% level

R&D spillovers and firms’ performance in Italy

157

(column 2, Table 3).14 These outcomes confirm the hypothesis that elasticities vary according to the procedure used to weight technology flows. More specifically, the results differ substantially between the two preferred weighting methods (the asymmetric technological index and the geographical index). Leaving geography aside, the regressions run using the asymmetric index of technological similarity show that the elasticity of external technology is 0.13 and that the major contribution to production comes from factors under the direct control of the firm (labour, physical capital and own R&D capital). This system clearly performs better than the one obtained using the symmetric index of similarity. However, the flow of innovation across firms is not determined only by technological similarity; it also depends on geographical proximity. This is particularly true in Italy, where localised knowledge spillovers (Breschi and Lissoni 2001) play a key role in technology transmission in the many areas of the country characterised by agglomerations of small and medium-sized firms that are similar and highly connected (the so-called Italian industrial districts). Social interaction in these areas (Guiso and Schivardi 2007) creates the local conditions which foster the circulation of knowledge, in line with the evidence provided by Jaffe et al. (1993). In our perspective, the estimated output elasticity (0.35) of R&D spillovers obtained using spatial proximity (Table 3, column 3) stresses the role of geography as a vehicle for enhancing the transmission of technology and the firm’s performance. These results are robust to splitting the sample of firms according to their location (Table 4).

14 A high spillover elasticity, about 0.60, is also obtained by Cincera (2005) and Los and Verspagen (2000). The sample analyzed by Cincera (2005) is composed of large firms and the period considered is 1987–1994. Los and Verspagen (2000) use a panel of USA manufacturing firms from 1977 to 1991. In both papers, the weighting system is the uncentered correlation calculated from patent data and the production function is the Cobb–Douglas.

Indeed, the impact of R&D spillovers on production is always higher in regressions using geographical proximity than in regressions based on asymmetric technological similarity, whatever the area of the country (North-West, North-East and Centre-South). However, the area-by-area analysis highlights some differences. Focusing on R&D capital and R&D spillovers and looking at the regressions using asymmetric technological proximity, we find that the elasticity of R&D capital differs only slightly across areas (it ranges from 0.138 to 0.158). On the other hand, external technology exerts a marginal effect on production in the Centre and in the South of Italy compared to that estimated for the North of the country: R&D spillover elasticity is high in the northern regions (0.17 in the North-West and 0.14 in the North-East) and very low (0.09) in the Centre-South of the country. Whatever the distance between firms, this result is in line with the evidence that, in the Centre-South of the country, innovative efforts are low and the local industrial systems are composed of strongly dissimilar companies in terms of human capital, market and product orientation, and technology. Such peculiarities reduce firms’ capacity to absorb external knowledge and, as a consequence, the role of R&D spillovers is found to be limited.

Table 4 Output elasticity in Italian manufacturing firms by area. 3SLS estimations (1998–2003)

Output       North-West                          North-East                          Centre-South
elasticity   υij = ω̃ij   υij = gij   υij = νij   υij = ω̃ij   υij = gij   υij = νij   υij = ω̃ij   υij = gij   υij = νij
L            0.4793***   0.4227***   0.4252***   0.4880***   0.3838***   0.3849***   0.4975***   0.3467***   0.3490***
             (0.00582)   (0.00691)   (0.00693)   (0.0055)    (0.00713)   (0.0073)    (0.00717)   (0.01068)   (0.01041)
K            0.2106***   0.1716***   0.1713***   0.2119***   0.1551***   0.1536***   0.2620***   0.1714***   0.1731***
             (0.00444)   (0.00557)   (0.0053)    (0.00397)   (0.006)     (0.00587)   (0.00523)   (0.00811)   (0.00792)
CT           0.1383***   0.1235***   0.1211***   0.1581***   0.1199***   0.1163***   0.1406***   0.0832***   0.0817***
             (0.00324)   (0.00407)   (0.00392)   (0.00288)   (0.00433)   (0.00423)   (0.00309)   (0.00514)   (0.00493)
Spill        0.1719***   0.2823***   0.2824***   0.1419***   0.3412***   0.3453***   0.0998***   0.3987      0.3961***
             (0.01145)   (0.01408)   (0.01394)   (0.00975)   (0.01437)   (0.01436)   (0.01412)   (0.0226)    (0.02193)

υij = ω̃ij: asymmetric technological spillovers; υij = gij: geographical spillovers; υij = νij: asymmetric technological and geographical spillovers
Note: Standard errors reported in brackets
*** Statistical significance at 1% level

Previous findings are subject to criticism, because firms’ exposure to R&D spillovers relies on either technological similarity or geographical proximity alone. Disregarding a channel (similarity or proximity) through which technology spills over from one firm to another is a severe shortcoming of the analysis. We attempt to overcome this limitation by averaging the indices of technological similarity and geographical proximity (see Eq. 9 and note 9). The estimates for the entire sample of firms indicate that the output elasticity of internal R&D capital is 0.1, while that of external technology, namely the impact on production of R&D spillovers, is 0.34 (Table 3). We find clear evidence that the production of the Italian manufacturing sector is strongly dependent on the technology that firms absorb from other firms. This finding also holds at the sub-national level. In fact, the highest value (0.39) for the elasticity of technological spillovers is estimated for firms located in the Centre-South of Italy, while the elasticity is 0.34 in the North-East and 0.28 in the North-West. From one region to another, we also observe different results for the elasticity of internal R&D capital, which varies from 0.12 in the North-West to 0.08 in the Centre-South (Table 4). Thus, in order to perform well, firms in the central and southern Italian regions (i) use R&D spillovers to compensate for the low level of internal innovative efforts and (ii) are able to gain great advantages from external technology. In order to better understand the relationships between factors, the next subsection presents the estimates of the elasticity of substitution proposed by Morishima (1967).

5.2 Elasticity of substitution

This section focuses on the degree of substitutability/complementarity among inputs. This is done by considering the elasticity of substitution proposed by Morishima (1967), which measures how the ratio of inputs s and k responds to a change in the price of input k (Celikkol and Stefanou 1999).
Given this definition, it follows that (a) two inputs are substitutes (complements) when the Morishima elasticity is positive (negative) and (b) the Morishima elasticity of substitution is not symmetric. Table 5 shows the estimated elasticities of substitution obtained when the flow of R&D capital is weighted using the method that combines firms’ technological similarity and geographical proximity (cf. Eq. 9). For any pair of inputs, we report the estimated value of the elasticity, its standard error and the t-statistic used to test the null hypothesis that the elasticity is unity. The first outcome to be emphasized is that almost all elasticities are significantly different from unity.15 This evidence supports our choice to use the translog instead of the Cobb–Douglas specification, which assumes the elasticity to be unity. Moreover, it is worth noting that labour and physical capital, and labour and R&D spillovers, are Morishima substitutes, while R&D capital and R&D spillovers are Morishima complements, whatever the change in price. As for the other pairs of inputs, because of the asymmetry of the Morishima index, the sign of the relationship depends on which price changes. Again, we find evidence of complementarity of internal R&D capital with respect to all other inputs (K, L, Spill) when we consider a change in the price of internal technology. This means that a decrease in the price of internal R&D capital induces an increase in the relative use of other inputs.

15 The hypothesis that the Morishima elasticity is unity is always rejected when analysing the entire sample of firms and the sub-sample of firms located in the Centre-South of Italy, while it is not rejected in three cases in the North-West (R&D capital and R&D spillovers, R&D capital and labour, physical capital and R&D capital) and in one case in the North-East of Italy (R&D capital and labour) (Table 5).

Table 5 Morishima elasticity of substitution by geographical areas (as a mean average of the sample) over the period 1998–2003

Morishima elasticity
of substitution        Italy        North West   North East   Centre-South
Sp and L               0.667***     1.094***     1.179***     0.457***
                       (0.019)      (0.0212)     (0.026069)   (0.046)
                       (−17.3536)   (4.4195)     (6.8767)     (−11.8009)
Sp and K               0.050        0.481***     0.252**      −0.262***
                       (0.051)      (0.0696)     (0.104284)   (0.0999)
                       (−18.5856)   (−7.4592)    (−7.1762)    (−12.634)
Sp and CT              −2.596***    −0.809**     −0.382       −6.767***
                       (0.3028)     (0.3796)     (0.6436)     (1.0884)
                       (−11.8776)   (−4.7653)    (−2.1472)    (−7.1364)
L and Sp               0.526***     1.074***     1.055***     0.223***
                       (0.0317)     (0.0128)     (0.0136)     (0.0814)
                       (−14.935)    (5.8064)     (4.0263)     (−9.5467)
K and Sp               −0.319***    0.557***     −0.224       −0.850***
                       (0.1015)     (0.09809)    (0.13802)    (0.23589)
                       (−12.9949)   (−4.5201)    (−8.8686)    (−7.844)
CT and Sp              −7.614***    0.116        −1.681**     −22.303***
                       (0.7322)     (0.5083)     (0.7934)     (2.5931)
                       (−11.7655)   (−1.739)     (−3.3793)    (−8.9866)
CT and L               3.619***     0.720        1.607*       9.157***
                       (0.4563)     (0.5316)     (0.8669)     (1.5381)
                       (5.7405)     (−0.5266)    (0.7001)     (5.3032)
CT and K               2.116***     −0.071       1.123*       6.574***
                       (0.331)      (0.3597)     (0.6346)     (1.1064)
                       (3.3729)     (−2.9774)    (0.1941)     (5.0383)
L and CT               −2.521***    −0.811**     −0.366       −6.629***
                       (0.2992)     (0.3817)     (0.6459)     (1.0827)
                       (−11.768)    (−4.7443)    (−2.1148)    (−7.046)
K and CT               −2.391**     −0.870**     −0.295       −6.387***
                       (0.3016)     (0.3878)     (0.6567)     (1.0766)
                       (−11.2423)   (−4.8227)    (−1.9726)    (−6.8616)
L and K                0.116**      0.502***     0.360***     −0.166*
                       (0.0488)     (0.0739)     (0.1106)     (0.0859)
                       (−18.1116)   (−6.7374)    (−5.7835)    (−13.5749)
K and L                0.831***     1.079***     1.568***     0.666***
                       (0.0561)     (0.0949)     (0.1292)     (0.1141)
                       (−3.0183)    (0.8305)     (4.3994)     (−2.9295)

Data refer to the results obtained using Eq. 9 as the weighting system of R&D spillovers
Note: Standard errors are reported in brackets in the first row below each estimate. The second bracketed row reports the t-test of H0: σij = 1
*, **, *** Statistical significance at the 10, 5 and 1% level, respectively

Table 5 reveals that the sign of the Morishima elasticity does not vary when the analysis is carried out at the sub-national level. The only two exceptions are the positive, although not significant, sign of the Morishima index between internal and external R&D capital in the North-West of Italy and the negative sign of the input ratio between R&D spillovers and physical capital in the Centre-South of the country. However, several differences emerge when looking at the magnitude of the Morishima elasticity of substitution estimated for the three macro-areas we analyse. We find that the sample of firms located in the Central-Southern regions shows a higher value of the elasticity between internal and external R&D capital than that estimated in the other areas. In particular, the internal R&D capital/R&D spillover ratio in this area of the country is very sensitive to the price of R&D spillovers: a decrease of 1% in the price of R&D spillovers yields an increase of 22.3% in the input ratio.

In terms of policy implications, these results indicate that policies that lower the price of R&D capital (both internal and external) would cause a significant increase in the use of technology and ultimately an improvement in firms’ performance. Given the estimated values of the Morishima elasticity, the economic gains of such policies would be higher in the Centre-South than in the North of Italy. The low level of R&D intensity in Italy and, in particular, in the Centre and South of the country, leaves the Italian government ample room for public intervention in promoting R&D activities.
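The unity test reported alongside each Morishima elasticity in Table 5 can be sketched in a few lines. This is only an illustration, assuming the usual t-ratio (estimate minus one, divided by the standard error); the function names are hypothetical, and the inputs are the rounded Italy-wide values printed in Table 5, so the results differ slightly from the published t-statistics, which were presumably computed from unrounded estimates.

```python
def t_stat_unity(estimate, std_err):
    """t-statistic for the null hypothesis that the elasticity equals 1."""
    return (estimate - 1.0) / std_err


def morishima_relation(estimate):
    """Positive elasticity: substitutes; negative elasticity: complements."""
    return "substitutes" if estimate > 0 else "complements"


# Rounded Italy-wide estimates and standard errors as printed in Table 5.
pairs = {
    "Sp and L": (0.667, 0.019),
    "CT and Sp": (-7.614, 0.7322),
}

for name, (est, se) in pairs.items():
    t = t_stat_unity(est, se)
    print(f"{name}: t = {t:.2f}, Morishima {morishima_relation(est)}")
```

A large absolute t-ratio rejects the unit elasticity that a Cobb–Douglas technology would impose, which is the argument the text uses in favour of the translog.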

6 Conclusions

Compared to the existing empirical literature on the role of R&D spillovers at firm level, this paper provides two original contributions. The first deals with the functional form to be used in modelling the impact of R&D on production, whereas the second concerns the use of different measures of R&D spillovers. As far as functional form is concerned, we use the translog production function, which is more flexible than the Cobb–Douglas. The results support our choice, because we reject the assumption inherent in the Cobb–Douglas technology. It is worth noting that the literature to which this paper refers never uses the translog function and generally omits to test the suitability of the Cobb–Douglas specification. With regard to R&D spillovers, we consider and compare different methods of measuring external technology. This procedure helps us to understand whether the role of R&D spillovers is sensitive to the method used to weight the flows of innovation across firms. To be precise, in order to determine the R&D spillover stock we use a measure of similarity between firms. It is assumed that the greater the similarity between two firms in terms of size and R&D efforts, the more they will absorb each other’s technology. To overcome the problem that the similarity index produces a symmetric weighting scheme, we consider an asymmetric transformation of the uncentered correlation. We also test the hypothesis that the closer two firms are, the more they will benefit from each other’s R&D. This is done through a spatial weighting scheme based on the great circle distance between firms.
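The two building blocks named above, the uncentered correlation and the great circle distance, can be illustrated as follows. This is a minimal sketch under stated assumptions: the haversine formula on a spherical Earth of radius 6,371 km is one common way to compute a great circle distance, the firm vectors and coordinates are hypothetical, and neither the asymmetric transformation nor the mapping from distance to weights used in the paper is reproduced here, since this part of the text does not spell them out.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def uncentered_correlation(x, y):
    """Symmetric similarity x.y / (|x| |y|), computed without subtracting means."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical firm characteristics (e.g. size and R&D effort) and locations.
firm_a = {"traits": [120.0, 3.5], "lat": 45.4642, "lon": 9.1900}   # near Milan
firm_b = {"traits": [80.0, 1.2], "lat": 41.9028, "lon": 12.4964}   # near Rome

similarity = uncentered_correlation(firm_a["traits"], firm_b["traits"])
distance = great_circle_km(firm_a["lat"], firm_a["lon"], firm_b["lat"], firm_b["lon"])
print(f"similarity = {similarity:.3f}, distance = {distance:.0f} km")
```

Because the uncentered correlation of (x, y) equals that of (y, x), any weighting scheme built directly on it is symmetric, which is why the text introduces an asymmetric transformation.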


F. Aiello, P. Cardamone

In the econometric section we control for selection bias by using a two-step IV estimator. In the first step, we model the selection process that leads firms to invest, or not, in R&D. In the second step, we estimate the translog production function with the 3SLS method. Data are from Capitalia and refer to a balanced panel of 1,203 manufacturing firms over the period 1998–2003. The key result is that output elasticity with respect to R&D spillovers is always positive and significant. Moreover, we find that different methods of measuring spillovers bring about different effects of inputs on firm output. In fact, we show that geographical proximity is relevant in determining the final result: our regressions based only on the asymmetric index of technological similarity underestimate the impact of R&D spillovers. These regressions do not control for geographical distance and, thus, are less precise in measuring firms’ capacity to absorb technology. Finally, from a regional point of view, it emerges that the role of external technology is greater in the Centre and South of Italy than in the North of the country. As for the elasticity of substitution among inputs, we find clear evidence that R&D spillovers are Morishima complements to the internal stock of R&D capital. A joint reading of this result and of that concerning the positive impact of R&D capital on firms’ production argues for substantial public intervention aimed at encouraging the adoption and diffusion of technology in the Italian manufacturing sector.

Appendix A

Tables 6 and 7.

Table 6 Results on the probability of investing in R&D for Italian manufacturing firms

              Symmetric technol.    Asymmetric technol.   Geograph. Spill.      Asymm. technol. and
              Spill. υij = ωij      Spill. υij = ω̃ij      υij = gij             geograph. Spill. υij = νij
ln(H)         0.0152 (0.003)***     −0.0181 (0.004)***    0.0226 (0.003)***     −0.0074 (0.003)**
ln(cf)        −0.0150 (0.03)        0.0481 (0.025)*       0.0635 (0.025)**      0.0405 (0.026)
D_exp         0.4117 (0.064)***     0.5198 (0.058)***     0.6043 (0.057)***     0.5727 (0.059)***
ln(ict)       0.1567 (0.025)***     0.1595 (0.022)***     0.1582 (0.021)***     0.1726 (0.022)***
North-West    −0.0578 (0.108)       −0.0097 (0.095)       −0.2835 (0.141)**     −1.2155 (0.132)***
North-East    0.1028 (0.106)        0.2037 (0.094)**      −0.1192 (0.138)       −0.9788 (0.127)***
Centre        0.1465 (0.116)        0.4111 (0.104)***     0.0162 (0.13)         −0.5712 (0.125)***
Scale         0.2264 (0.083)***     0.0297 (0.07)         0.0655 (0.068)        0.0408 (0.07)
Specialized   0.3112 (0.069)***     0.2959 (0.059)***     0.4272 (0.056)***     0.3462 (0.059)***
High-tech     0.4311 (0.167)***     0.5656 (0.132)***     0.7353 (0.126)***     0.6637 (0.14)***
ln(k)         −0.3515 (0.547)       0.3191 (0.176)*       −0.5131 (1.18)        0.6088 (1.138)
ln(l)         1.9218 (1.063)*       0.2596 (0.295)        −1.0138 (2.226)       −3.3991 (2.166)
ln(sp)        −28.0632 (1.25)***    −1.2979 (0.127)***    −7.3220 (5.588)       −50.1745 (4.37)***
ln(l)ln(k)    −0.0526 (0.032)       −0.0286 (0.03)        −0.0188 (0.026)       −0.0312 (0.03)
ln(l)ln(sp)   −0.1443 (0.084)*      0.0458 (0.021)**      0.1317 (0.165)        0.3131 (0.167)*
ln(k)ln(sp)   0.0369 (0.043)        −0.0197 (0.012)       0.0424 (0.087)        −0.0434 (0.087)
[ln(l)]²      0.0845 (0.082)        −0.0794 (0.071)       −0.1017 (0.065)       −0.0609 (0.071)
[ln(k)]²      0.0155 (0.018)        −0.0055 (0.018)       −0.0022 (0.016)       0.0084 (0.018)
[ln(sp)]²     2.4954 (0.113)***     0.1658 (0.012)***     0.5373 (0.428)        4.0965 (0.34)***
Constant      152.66 (7.187)***     1.65 (1.11)           44.74 (36.845)        304.42 (28.705)***
Obs.          3595                  3595                  3595                  3595
Wald test     910.25                909.38                772.67                860.07
p-value       0.00                  0.00                  0.00                  0.00
Pseudo R²

Probit estimates over the period 1998–2003
Note: Standard errors in brackets. H human capital, cf cash flow, D_exp dummy equal to one if the firm exports and zero otherwise, ict ICT investments, k physical capital, l labour, sp spillovers, sectoral (according to the Pavitt classification: traditional, scale, specialized and high technological industries) and territorial (North-West, North-East, Centre and South) dummies
***, **, * Statistical significance at 1, 5 and 10%, respectively

Table 7 Estimated coefficients of the translog production function for Italian manufacturing firms

              Symmetric technol.      Asymmetric technol.     Geographical Spill.     Asymm. technol. and
              Spill. υij = ωij        Spill. υij = ω̃ij        υij = gij               geograph. Spill. υij = νij
αL            0.9491 (0.0306)***      0.7183 (0.0167)***      0.9501 (0.0324)***      0.9231 (0.0317)***
αK            0.4412 (0.0225)***      0.3084 (0.0124)***      0.4324 (0.0238)***      0.4244 (0.0231)***
αC            0.2826 (0.015)***       0.2302 (0.0081)***      0.3020 (0.017)***       0.2945 (0.0161)***
αSp           −0.6729 (0.0593)***     −0.2569 (0.0294)***     −0.6844 (0.0644)***     −0.6419 (0.0624)***
βLK           0.0044 (0.0019)**       −0.0030 (0.0016)*       0.0020 (0.0018)         0.0028 (0.0018)
βLC           0.0088 (0.0013)***      0.0057 (0.0011)***      0.0076 (0.0013)***      0.0084 (0.0013)***
βKC           0.0058 (0.001)***       0.0016 (0.0009)*        0.0058 (0.001)***       0.0059 (0.001)***
βLSp          −0.0659 (0.0041)***     −0.0336 (0.0026)***     −0.0606 (0.004)***      −0.0614 (0.0041)***
βKSp          −0.0425 (0.0031)***     −0.0238 (0.002)***      −0.0395 (0.0031)***     −0.0381 (0.003)***
βCSp          −0.0246 (0.002)***      −0.0157 (0.0013)***     −0.0239 (0.0021)***     −0.0248 (0.0021)***
βLL           0.0527 (0.0029)***      0.0308 (0.0022)***      0.0510 (0.0028)***      0.0501 (0.0029)***
βKK           0.0323 (0.0017)***      0.0252 (0.0015)***      0.0303 (0.0016)***      0.0307 (0.0017)***
βCC           0.0101 (0.0009)***      0.0084 (0.0008)***      0.0105 (0.0009)***      0.0104 (0.0009)***
βSpSp         0.133 (0.0085)***       0.0731 (0.0053)***      0.1225 (0.0083)***      0.1256 (0.0086)***
Scale         0.1213 (0.0362)***      0.0927 (0.0358)***      0.0662 (0.036)*         0.1006 (0.0357)***
Specialized   0.1865 (0.0277)***      0.0877 (0.0276)***      0.1904 (0.0273)***      0.1722 (0.0273)***
High-tech     0.198 (0.0472)***       0.0055 (0.047)          0.2156 (0.0466)***      0.1813 (0.0467)***
North-West    0.1151 (0.0486)**       −0.0180 (0.0483)        −0.0293 (0.048)         −0.0326 (0.0482)
North-East    0.1603 (0.0486)***      0.1170 (0.0481)**       −0.0028 (0.0481)        0.0135 (0.0483)
South         0.0979 (0.0523)*        0.1249 (0.0518)**       −0.0454 (0.0517)        −0.0205 (0.0519)
Obs.          1,537                   1,537                   1,537                   1,537
F-test        10,288.9                8,080.4                 51,791.5                55,419.9
Prob > F      0.00                    0.00                    0.00                    0.00
R-squared     0.83                    0.85                    0.84                    0.83

Estimation method: 3SLS (1998–2003)
Note: Standard errors reported in brackets
*, **, *** Statistical significance at the 10, 5 and 1% level, respectively. The instrumental variables considered are the 1-year lagged values of the endogenous variables
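Translog coefficients such as those in Table 7 are not elasticities themselves; the output elasticity of each input depends on the input levels at which it is evaluated. The sketch below illustrates the mechanics under an explicit assumption about the normalisation, namely ln y = a0 + Σi αi ln xi + ½ Σi Σj βij ln xi ln xj with βij = βji, which may differ from the exact specification estimated in the paper. The coefficient values and input levels are hypothetical and are not taken from Table 7.

```python
import math


def translog_elasticities(alpha, beta, log_x):
    """Output elasticities e_i = alpha_i + sum_j beta_ij * ln x_j.

    Assumes ln y = a0 + sum_i alpha_i ln x_i
                     + 0.5 * sum_i sum_j beta_ij ln x_i ln x_j
    with a symmetric beta matrix (beta_ij == beta_ji).
    """
    names = list(alpha)
    return {
        i: alpha[i] + sum(beta[i][j] * log_x[j] for j in names)
        for i in names
    }


# Hypothetical two-input example (labour L and physical capital K).
alpha = {"L": 0.6, "K": 0.3}
beta = {
    "L": {"L": 0.04, "K": -0.02},
    "K": {"L": -0.02, "K": 0.02},
}
log_x = {"L": math.log(50.0), "K": math.log(200.0)}

elasticities = translog_elasticities(alpha, beta, log_x)
print(elasticities)
```

When every βij is zero the elasticities collapse to the constant αi, which is the Cobb–Douglas case that the translog nests.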

References

Adams JD, Jaffe AB (1996) Bounding the effect of R&D: an investigation using matched establishment-firm data. RAND J Econ 27:700–721
Aiello F, Cardamone P (2005) R&D spillovers and productivity growth. Further evidence from Italian manufacturing microdata. Appl Econ Lett 12:625–631
Aiello F, Pupo V (2004) Il tasso di rendimento degli investimenti in Ricerca e Sviluppo delle imprese innovatrici italiane. Rivista di Politica Economica XCIV, V–VI:81–117
Aiello F, Cardamone P, Pupo V (2005) Produttività e Capitale Tecnologico nel Settore Manifatturiero Italiano. L’industria Rivista Di Economia e Politica Industriale N. 1:119–145
Antonioli B, Fazioli R, Filippini M (2000) Il servizio di igiene urbana italiano tra concorrenza e monopolio. In: Cambini C, e Bulckaen F (eds) I servizi di pubblica utilità. Concorrenza e regolazione nei nuovi mercati. Franco Angeli, Milano
Arrow KJ (1962) The economic implications of learning-by-doing. Rev Econ Stat 29(1):155–173
Audretsch DB, Feldman MP (2004) Knowledge spillovers and the geography of innovation. In: Henderson V, Thisse JF (eds) Handbook of regional and urban economics, 1st edn 1, vol 4(4). Elsevier, Amsterdam
Becker B, Pain N (2003) What determines industrial R&D expenditure in the UK? National Institute of Economic and Social Research Discussion paper, n 211
Berndt ER, Christensen LR (1973) The translog function and the substitution of equipment, structures, and labor in US manufacturing 1929–1968. J Econom 1:81–114
Bhattacharya M, Bloch H (2004) Determinants of innovation. Small Bus Econ 22:155–162
Breschi S, Lissoni F (2001) Knowledge spillovers and local innovation system: a critical survey. Ind Corp Change 10:975–1005
Capitalia (2002) VIIIa Indagine sulle Imprese manifatturiere. Ottavo Rapporto sulle Industrie Italiane e sulla Politica Industriale. Ministero dell’Industria and Capitalia, Rome
Capitalia (2005) IXa Indagine sulle Imprese manifatturiere. Nono Rapporto sulle Industrie Italiane e sulla Politica Industriale. Ministero dell’Industria and Capitalia, Rome
Celikkol P, Stefanou SE (1999) Measuring the impact of price induced innovation on technological progress: application to the US food processing and distribution sector. J Product Anal 12:135–151
Christensen LR, Jorgenson DW, Lau LJ (1973) Transcendental logarithmic production frontiers. Rev Econ Stat 55:28–45
Ciccone A (2004) Human capital as a factor of growth and employment at the regional level: the case of Italy. Mimeo, Universitat Pompeu Fabra
Cincera M (2005) Firms’ productivity growth and R&D Spillovers: an analysis of alternative technological proximity measures. CEPR Discussion paper, n 4984
Cohen WM, Levinthal DA (1989) Innovation and learning: the two faces of R&D. Econ J 99:569–596
Cohen WM, Levinthal DA (1990) Absorptive capacity: a new perspective on learning and innovation. Adm Sci Q 5:128–152
Feser EJ (2004) A flexible test for agglomeration economies in two US manufacturing industries. CES Working Paper, n 14. Center for Economic Studies, US Census Bureau
Goel D (2002) Impact of infrastructure on productivity: case of Indian registered manufacturing. CDP Working Paper, n 106. Centre for Development Economics, Delhi School of Economics
Griliches Z (1979) Issues in assessing the contribution of R&D to productivity growth. Bell J Econ 10:92–116
Griliches Z (1990) Patent statistics as economic indicators: a survey. J Econ Lit 28:1661–1707
Griliches Z (1991) The search for R&D spillovers. Scand J Econ 94:29–47
Guiso L, Schivardi F (2007) Spillovers in industrial districts. Econ J 117:68–93
Gustavsson P, Poldhal A (2003) Determinants of firms R&D: evidence from Swedish firm level data. FIEF Working Paper, n 190. Stockholm School of Economics and Trade Union Institute for Economic Research
Harhoff D (2000) R&D spillovers, technological proximity and productivity growth. Evidence from German panel data. Schmalenbach Bus Rev 52:238–260
Inkmann J, Pohlmeier W (1995) R&D spillovers, technological distance, and innovative success. Mimeo, University of Konstanz
Jacobs J (1969) The economy of cities. Random House, New York
Jaffe AB (1986) Technology opportunity and spillovers of R&D: evidence from firms patents, profits, and market value. Am Econ Rev 76:984–1001
Jaffe AB (1988) Demand and supply influences in R&D intensity and productivity growth. Rev Econ Stat 70:431–437
Jaffe AB, Trajtenberg M, Henderson R (1993) Geographic localization of knowledge spillovers as evidenced by patent citations. Q J Econ 108(2):577–598
Kaiser U (2002) Measuring knowledge spillovers in manufacturing and services: an empirical assessment of alternative approaches. Res Policy 31:125–144
Koo J (2005) Technology spillovers, agglomeration, and regional economic development. J Plann Lit 20:99–115
Lall S, Shalizi Z, Deichmann U (2001) Agglomeration economies and productivity in Indian industry. Policy Research Working paper, n. 2663. The World Bank Development Research Group
Leo H (2003) Determinants of innovative activities at the firm level. Paper presented to the International Workshop “Empirical Studies on Innovation in Europe”. 1–2 December 2003, Faculty of Economics, University of Urbino, Italy
Los B, Verspagen B (2000) R&D spillovers and productivity: evidence from US manufacturing microdata. Empir Econ Rev 25:127–148
Lu WC, Chen JR, Wang CL (2005) R&D, spatial spillovers and productivity growth: evidence from dynamic panel. Mimeo, National Centre University, Taiwan
Lucas RE (1988) On the mechanics of economic development. J Monet Econ 22(1):3–42
Marshall A (1920) Principles of economics. MacMillan Press Ltd, London
May JD, Denny M (1979) Factor augmenting technical progress and productivity in US manufacturing. Int Econ Rev 20:759–774
Medda G, Piga C (2004) R&S e Spillovers industriali: Un’analisi sulle imprese italiane. Crenos Working Paper, n. 2004/2006
Morishima M (1967) A few suggestions on the theory of elasticity. Keizai Hyoron (16):144–150
Orlando M (2000) On the importance of geographic and technological proximity for R&D spillovers: an empirical investigation. RWP 00–02, Federal Reserve Bank of Kansas City
Raut LK (1995) R&D spillovers and productivity growth: evidence from Indian private firms. J Dev Econ 48:1–23
Romer PM (1986) Increasing returns and long-run growth. J Polit Econ 94:1002–1037
Verspagen B (1995) R&D and productivity: a broad cross-section cross-country look. J Product Anal 6:117–135
Vinding AL (2006) Absorptive capacity and innovative performance: a human capital approach. Econ Innov New Technol 15(4/5):507–517
Wakelin K (2001) Productivity growth and R&D expenditure in UK manufacturing firms. Res Policy 30:1079–1090
Wang Y (2007) Trade, human capital, and technology spillovers: an industry-level analysis. Rev Int Econ 15(2):269–283
Wieser R (2005) Research and development productivity and spillovers: empirical evidence at the firm level. J Econ Surv 19:587–621
Wooldridge JM (2002) Econometric analysis of cross section and panel data. MIT, Cambridge

The impact of decentralization and inter-territorial interactions on Spanish health expenditure

Joan Costa-Font · Francesco Moscone

Abstract This paper examines the determinants of regional public health expenditure in a decentralised health system. Unlike previous studies, we take into account possible policy and political interactions among authorities, as well as unobserved heterogeneity. Our empirical contribution lies in running a spatial panel specification using a dataset of all Spanish region states on aggregated and disaggregated health expenditures (pharmaceuticals, inpatient and primary care). Results are consistent with some degree of interdependence between neighbouring regions in spending decisions. Empirical evidence of long-term efficiency effects of health care decentralisation suggests that a specific spatial-institutional design might improve the efficiency of the health system as well as regional cohesion. Political and scale effects are consistent with theoretical predictions.

Keywords Health expenditure · Decentralisation · Spatial econometrics · Panels

JEL Classification I18 · I38 · C31 · C33

J. Costa-Font European Institute, LSE Health and Social Care, London School of Economics, London, UK

J. Costa-Font CAEPS & Departament de Teoria Econòmica, Universitat de Barcelona, Barcelona, Catalonia, Spain

F. Moscone Department of Economics and Girton College, University of Leicester, Leicester, UK

F. Moscone Department of Economics and Girton College, University of Cambridge, Cambridge, UK


1 Introduction

The importance of decentralisation in affecting public sector performance and outputs has been increasingly recognized. This is particularly relevant for those expenditure sources that have a sizeable impact on human welfare and that are publicly financed (e.g. health and social care). Programmes of fiscal and political decentralisation are progressively expanding in most countries due to their potential economic and political benefits. Decentralisation is put forward as a means of taking advantage of local values and needs, ultimately improving government responsiveness. Theories of fiscal federalism postulate that government political decentralisation is an efficiency-enhancing territorial tool. More importantly, it is argued that decentralisation can make governments more accountable to citizens. By fostering jurisdictional competition, a decentralised public sector might give rise to foreseeable welfare benefits (Besley and Case 1995; Revelli 2002, 2006). The latter is especially relevant in highly visible policy areas such as health care. The effects of decentralisation on health spending have been investigated in the US (Skinner and Wennberg 2000), Canada (Di Matteo and Di Matteo 1998), Switzerland (Crivelli et al. 2006), Spain and Italy (Costa-Font and Pons-Novell 2007; Giannoni and Hittris 2002).

An important issue that guides the institutional design of a health system, particularly if it is regionally decentralised, is that of inequalities in output. Countries that legally ensure universal access to health care are committed to delivering health care independent of the ability to pay. Accordingly, the influence of income on health expenditure is regarded as a potential source of regional inequality. The elasticity of income is also important for its implications for resource distribution. A number of cross-country studies suggest that health care is a luxury good rather than a normal good (Newhouse 1977), though when regional income is taken into account, elasticities drop significantly and the hypothesis of health care being a luxury good does not always hold (Di Matteo and Di Matteo 1998; Giannoni and Hittris 2002).

A further issue that remains unexplored is the potential long-term effects of decentralisation. This refers to a “learning by doing” process that could take place some time after the implementation of a new decentralised institutional setting. The ideology of regional incumbents (invested with political responsibilities) does not seem to be a clear explanation for regional spending variation. One might argue that only left-wing regional incumbents of relatively rich regions have incentives to increase expenditure, given that they face the competition of a relatively more developed private health care sector.

Along with these important questions there is the potential influence of spatial interaction among regions when allocating resources to health care programmes. A decentralised setting could give rise to local competitive interactions, thus explaining (in part) variations in health care expenditure. Only recently has a new strand of health economics explored the influence of spatial effects on health outcomes and expenditure using either cross-section (Moscone and Knapp 2005) or pooled data (Revelli 2006; Costa-Font and Pons-Novell 2007; Moscone et al. 2007b). However, the panel nature of regional data has, with one exception, not been exploited in the arena of health care (Moscone et al. 2007a).


The decentralised Spanish health care system is a clear-cut candidate within which to examine these issues. Spain exhibits significant regional heterogeneity in needs and preferences (Rico and Costa-Font 2006; Costa-Font and Rico 2006). Two major features have defined health care reform in Spain. On the one hand, the consolidation of the National Health System (NHS) was largely politicised. Hence, in examining regional expenditure data we would expect significant politically driven effects. On the other hand, the gradual process of health care decentralisation from the early eighties until 2002 has given rise to competitive spatial interactions among region states (Costa-Font and Rico 2006).

This paper draws upon a panel of Spanish regions to examine the following issues. First, we investigate the effects of the inception and experience of decentralised political or fiscal health care institutions on health expenditure. Second, we attempt to disentangle the effect of the political ideology of regional incumbents on expenditure, as well as its association with regional income, ultimately determining the influence of private sector development. Third, we test whether there is in fact a certain degree of interdependence between neighbouring municipalities in spending decisions. Fourth, we control for regional income as well as health care activity inputs such as doctors and beds, population scale effects, and demand and health need influences such as the regional demographic composition. Finally, we examine the influence of such effects over different types of expenditure (total, drug, inpatient and outpatient) to account for heterogeneity in different spending categories.

The paper is structured as follows. Section 2 introduces the institutional setting. Section 3 reviews the previous literature on the determinants of health care expenditure and contains the research questions of the study. Section 4 describes the data and methods. Section 5 discusses the empirical results, while Sect. 6 closes with some concluding remarks.

2 The institutional setting

The Spanish National Health System (NHS) finances health care through funds raised by general taxation, with user co-payments playing a markedly restricted role. Health care expenditure accounts for 7.5% of GDP, with approximately 5.5% corresponding to public expenditure and 2.1% to private expenditure. The population has the right to free access to services. Benefits are comprehensive and cover all types of care except long-term care and dental services, which are covered only in some region states. Funds are centrally collected, with the exception of Navarra and the Basque Country. Once the Parliament determines the amount of health care expenditure in the National General Budget, resources are allocated to regions by means of a central block grant following an unadjusted capitation formula. Since 2002, the Spanish population has received health care services from their own regions, legally termed Autonomous Communities (ACs). The ACs are also responsible for health care planning, organization and management, and are thus politically accountable to their constituents. Before 2002, only some ACs had health care responsibilities, while most were centrally managed. Catalonia obtained health care responsibilities in 1981, followed by Andalucia (1984), the Basque Country and Valencia (1988),


J. Costa-Font, F. Moscone

Galicia and Navarre (1991), the Canary Islands (1994) and, from 2002, the remaining ACs.

3 Research questions

3.1 Institutional effects: decentralisation and experience

Decentralisation can take place by transferring fiscal and political responsibilities to newly created (junior) institutions. Within Spain, this has been the case for Navarra and the Basque Country, since they can raise their own taxes. Therefore, if health care is a high policy priority one might expect fiscal accountability to increase expenditure. As for political decentralisation, during the period examined it also included Catalonia, Valencia, Galicia, Canarias and Andalucia. At first, political decentralisation entails initial sunk costs, though some long-run effects resulting from management experience could arise as well. It is not clear whether the mechanisms of vertical and horizontal competition between regional health systems lead to efficiency in public policy making. In countries where multiple region-states provide health care, one can argue that region-specific preferences and needs can be taken into account in the allocation of resources. In this paper we are interested in testing whether decentralisation experience and fiscal accountability affect expenditure.

3.2 Income effects and regional inequalities

There has been a long-lasting discussion on whether or not health care is a luxury good (Newhouse 1977, 1992). Interestingly, Okunade and Murthy (2002) find that income exhibits a stable positive relationship with per capita health care expenditure. However, the influence of income at the regional level necessitates the examination of regional inequalities. Indeed, richer regions should be expected to pay higher taxes and exhibit higher expenditure. Conversely, the richer a region is, the more likely it is that the private sector is used to cover demands left unsatisfied by the public sector, thus leading to opposite effects. 
Furthermore, in Spain, budgeted health expenditure is allocated to regions on a population basis without taking income into consideration, although regions might well overspend and rely on debt, depending on their expenditure priorities (Lopez-Casasnovas et al. 2005). Hence, the hypothesis to test here is the extent to which income explains the allocation of public expenditure; this might point to the presence of some regional inequalities.

3.3 Political effects

As in other OECD countries, the size of the public health care sector is determined by the political priorities of the incumbent parties running the health system (Parkin et al. 1987). In principle, we would expect left wing governments to increase public health care expenditures at a faster rate than right-wing governments. Parties of the


left may favour spending on social welfare (Henrekson 1988). However, recent evidence indicates that the left gains credibility through expenditure cuts, while the right gains credibility through tax revenue increases (Tavares 2004). Finally, in the health care arena, the existence of a private sector in relatively more affluent regions might exert additional pressures on political incumbents. Whilst right-wing incumbents are generally expected to opt for private sector alternatives, left wing incumbents would expand public expenditure. In this paper we test the influence of political affiliation, and its interaction with income, on health expenditure.

3.4 Preference, health care inputs and heterogeneity

Previous evidence using decomposition analysis of health care expenditure data suggests that after the 1990s, volume, rather than price, became the main determinant of health care expenditure (Lopez-Casasnovas et al. 2005). Thus, the ageing process and the relatively higher coverage for certain treatments, such as drugs for the elderly, might play an important role in explaining spending variation. Similarly, changes in utilization patterns might result from differences in supply inducement incentives by doctors in treatment intensity and differences in the types of inputs chosen. For instance, those regions that exhibit high levels of physician density tend to display lower levels of in-patient care due to some substitution taking place (Skinner and Wennberg 2000). Furthermore, the population size of the system is a well known determinant; thus, larger regional health services are likely to exhibit economies of scale in the provision of health care. Finally, given that coverage of certain health care programs varies across the various types of expenditure, we would expect different coefficients across expenditure sources. 
For instance, policies to cut down pharmaceutical expenditure by promoting generic drugs are more prevalent in some region states (including Andalucia, Catalonia and Navarra). Accordingly, we test these research questions empirically.

4 Spatial effects in public and health expenditure

Increasing evidence for the need to control for spatial structures is found across a variety of health conditions. For instance, there is evidence of spatial autocorrelation in mortality (Lorant et al. 2001), as well as in morbidity, including child leukemia (see Alexander 1993), childhood cancer (Gatrell and Whitelegg 1993), and asthma (Hsiao 2000). Some studies find spatial autocorrelation of cancer mortality patterns (Thouez et al. 1997). In decentralised government structures, states and organisations compete with each other for health care resources, or for the concentration of a certain quality of health care if there is a common central distribution. For instance, Moscone and Knapp (2005) identify a number of potential sources of spatial interaction in the local organisation of health care, including demonstrative and mimicking effects. Furthermore, strategic interaction might take place among regional governments in the setting of taxes and expenditures, so that some welfare competition can take place (Costa-Font and Pons-Novell 2007). Citizens of one jurisdiction might benchmark the benefits


Table 1 Expected results

Effects                  Variable                                  Effect on expenditure
Spatial effects          Spatial lag                               Positive
Institutional            Political and fiscal decentralisation     Positive and negative
Income                   Per capita GDP                            Positive
Political interactions   Left wing–left wing income interactions   Positive effects
Scale effects            Population                                Negative
Demand and preference    Aging                                     Positive
Supply inducement        Beds and doctors concentration            Positive
Coverage                 Aging for drug expenditure                Positive

levels and successful programs offered by neighbouring jurisdictions when judging their own jurisdiction’s performance. Other forms of interaction might result from so-called welfare migration (Brueckner 2000): when generosity attracts migrants, the more generous regions must raise taxes to fund the new recipients of welfare. However, in a setting such as that of Spain (Costa-Font and Rico 2006), where welfare migration is rather uncommon, a separate equilibrium can arise in which regional incumbents have incentives to increase coverage. When coordination by the central state is weak, there are incentives for regional incumbents to compete with the central state (Besley and Case 1995). In the Spanish case, we would expect some strategic interaction whereby the welfare coverage of some ACs is likely to depend on the coverage in neighbouring regions. Therefore, there might be cross-section dependence in region-specific data. Table 1 contains a summary of the expected effects described in this section.

5 Data and methods

We collected data on Spanish health care expenditure at the regional level from the Ministry of Health and Consumption for the years 1995–2002 (Cuentas Satelites del Sistema Sanitario, 1995–2002), and complementary statistical information (GDP, population and inflation rates) from Contabilidad Regional de España. Information on the number of doctors and health professionals, beds and occupancy rates was gathered from the National Institute of Statistics (INE). Data on electoral results were collected from the elelweb.web page, which contains information on electoral trends in Spain. We also note that the devolution process was completed during the period from 1995 to 2002. Devolution developed asymmetrically: while seven ACs were entrusted with health care responsibilities, the other ten ACs were centrally ruled by the MoH through a specific agency called INSALUD.

To examine the determinants of health expenditure, we propose a panel data model extended to incorporate possible interaction among regions, where the value of the dependent variable for one authority is simultaneously determined with that of neighbouring regions. Given N regions observed over T time periods, we assume that


public per capita expenditure observed in the ith region at time t, namely $y_{it}$, is generated according to the following panel model:

$$y_{it} = \rho \sum_{j=1}^{N} w_{ij}\, y_{jt} + \beta' x_{it} + e_{it}, \qquad (1)$$

where $x_{it}$ is a k × 1 vector of regressors, $e_{it}$ is the error term, and $w_{ij}$ is the generic element of a non-negative N × N matrix W, known as the spatial weights matrix. In a spatial weights matrix the rows and columns correspond to the cross-section observations, and $w_{ij}$ can be interpreted as the strength of potential interaction between units i and j (Anselin 1988; Arbia 2006). The specification of W is in general arbitrary, based on some measure of distance between units. In our empirical model we approximate the weights $w_{ij}$ using information on contiguity among Spanish regions, assigning $w_{ij} = 1$ when region states i and j share a common border or vertex, and $w_{ij} = 0$ otherwise. Following most of the applied literature on spatial econometrics, we row-standardized the spatial weights matrix. The introduction of a spatial lag in the model allowed us to correct for potential spatial dependence. We note that the presence of spatial autocorrelation has important consequences for some of the inferences obtained using classical econometric methodology, and may indeed invalidate them.

Following previous studies, we have developed a panel data framework where the $x_{it}$ are GDP at the AC level, price indices at the AC level and population. Furthermore, our model contains information on the potential demand for health care through a variable proxying the ageing process, and information on supply availability and performance such as the number of doctors, beds, and the occupancy rate. Our model includes a variable measuring the ideology of the regional incumbent (or of the national incumbent, for those ACs managed by INSALUD). Indeed, health expenditure might well result from differences in needs, which are measured through the percentage of the population over 75, as well as from differences in the availability of doctors (doc) and beds (bed). 
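As a minimal sketch of the weighting scheme just described (binary contiguity followed by row standardisation), the following Python fragment builds W for a hypothetical four-region map and computes the spatial lag, that is, the neighbour average of the dependent variable. The region labels, the adjacency list and the expenditure figures are invented for illustration and are not the Spanish data; the function names are ours.

```python
# Hypothetical adjacency list: which regions share a border or vertex.
neighbours = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}
regions = sorted(neighbours)


def contiguity_matrix(neighbours, regions):
    """Binary weights: w_ij = 1 if i and j are contiguous, 0 otherwise."""
    return [[1.0 if j in neighbours[i] else 0.0 for j in regions]
            for i in regions]


def row_standardise(w):
    """Divide each row by its sum so that every non-empty row sums to one."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def spatial_lag(w, y):
    """Return the vector whose i-th entry is sum_j w_ij * y_j."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


W = row_standardise(contiguity_matrix(neighbours, regions))
y = [700.0, 650.0, 720.0, 600.0]  # made-up per capita expenditure, order A..D
Wy = spatial_lag(W, y)            # neighbour average for each region
```

With row standardisation, each element of `Wy` is a simple average over a region's neighbours, which is why ρ can be read as the response of a region's expenditure to the mean expenditure of adjacent regions.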
The number of doctors might lead to some supply induced demand and the number of beds might lead to an expansion of patients treated to justify a certain capacity (Davis et al. 2000). The random effects model with spatially lagged dependent variable, expressed in stacked form as T successive cross-sections, is

$$y_{it} = \rho \sum_{j=1}^{N} w_{ij}\, y_{jt} + \beta' x_{it} + e_{it}, \qquad (2)$$

$$e_{it} = \mu_i + \varepsilon_{it}, \qquad (3)$$

where the $\varepsilon_{it}$ are IID random variables, and $\mu_i$ is a random effect associated with region i, IID with zero mean and variance $\sigma_\mu^2$. The random effects specification with spatial error correlation is (Baltagi et al. 2003)


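$$y_{it} = \rho \sum_{j=1}^{N} w_{ij}\, y_{jt} + \beta' x_{it} + e_{it}, \qquad (4)$$

$$e_{it} = \mu_i + v_{it}, \qquad (5)$$

$$v_{it} = \lambda \sum_{j=1}^{N} w_{ij}\, v_{jt} + \varepsilon_{it}. \qquad (6)$$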
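Because $y_{it}$ appears on both sides of the spatial lag equation, the values for all regions in a given period are determined jointly. A minimal sketch, assuming a row-standardised W and |ρ| < 1 so that a fixed-point iteration converges, solves y = ρWy + b for one cross-section, where b stands for β′x + e. The three-region W, the vector b and the value of ρ are hypothetical, and the function name is ours.

```python
def solve_spatial_lag(W, base, rho, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + base by fixed-point iteration.

    `base` plays the role of beta'x_it + e_it for a single period t.
    Convergence is assumed to hold because W is row-standardised
    and |rho| < 1.
    """
    y = list(base)
    for _ in range(max_iter):
        y_new = [b + rho * sum(w * v for w, v in zip(row, y))
                 for row, b in zip(W, base)]
        if max(abs(a - c) for a, c in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("no convergence; check that |rho| < 1")


# Hypothetical example: three mutually contiguous regions.
W = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
base = [1.0, 2.0, 3.0]
rho = 0.4
y = solve_spatial_lag(W, base, rho)

# Residual of the simultaneous system, which should be numerically zero.
residual = [yi - (b + rho * sum(w * v for w, v in zip(row, y)))
            for yi, row, b in zip(y, W, base)]
```

The same logic applies to the spatial error process in Eq. (6), with λ in place of ρ and the disturbances in place of y. It also illustrates why ordinary least squares is not appropriate for the spatial lag model: the neighbours' outcomes on the right-hand side depend on the region's own disturbance.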

In this case the error term is the orthogonal sum of two components, a time-invariant, region-specific random disturbance, and a spatial process. A critical assumption in the random effects specifications is that $E(\varepsilon_{it}\mu_i) = 0$ and $E(\mu_i x_{it}) = 0$, for i = 1, . . . , N, and t = 1, . . . , T; that is, the individual-specific component is hypothesized to be orthogonal to the explanatory variables (Hsiao 2003). If this hypothesis does not hold, estimates from the random effects model might be biased because of the correlation between the error term and the regressors.

5.1 Results

The model is estimated in log–log form, allowing the coefficients to be interpreted as elasticities. A classical regression model was run to determine the extent to which spending variations could be explained by variations in need for services. However, since part of the variation in spending could be explained by the interaction among regions, a spatial autoregressive model was specified. The results from the spatial model can then be compared with those from a classical (non-spatial) model, as shown in Tables 3, 4, 5, and 6. All econometric analyses were conducted using Stata and Matlab. We distinguish between total expenditure and three different sources of expenditure: pharmaceutical expenditure, inpatient expenditure, and outpatient expenditure. Considering only the aggregate of health spending may result in a reduction of spatial correlation, because different effects in the spending categories may cancel out overall (Baicker 2005; Moscone et al. 2007a).

5.2 Classic OLS regressions

The classical model shows an $R^2$ of 0.93 for total expenditure, 0.89 for pharmaceutical expenditure, 0.72 for inpatient expenditure, and 0.56 for outpatient expenditure, all indicating a good fit. As some of the econometric techniques to follow are based on the assumption of normality, we also calculated the Jarque-Bera (JB) test for normality of errors for each spending category (see Table 2). Using the conventional 5% level of significance, the null hypothesis of normality is not rejected, with the only exception of primary care spending. As for the estimated regression coefficients, for total expenditure they are highly significant and have the expected signs. The remaining spending categories show a limited set of significant variables, which all have the expected signs. Further, the F test is 98.57 for total expenditure, and 55.49, 17.42, 8.96 for pharmaceutical, inpatient, and primary care, respectively. These results lead us to


Table 2 Descriptive statistics and tests of normality on the selected variables (years 1995–2002)

                                      Mean        Std. error   Min.       Max.
Per capita total exp. (euro)          695.79      117.18       468.28     990.36
Per capita pharm. exp. (euro)         149.80      32.42        87.02      245.16
Per capita inpatient exp. (euro)      367.67      67.56        238.83     542.12
Per capita primary care exp. (euro)   119.94      80.86        65.01      515.83
Population                            2,370,794   2,062,899    268,663    7,478,432
Decentralised                         0.4117      0.4939       0          1
Time from decentralisation (years)    4.2647      5.9494       0          21
Per capita GDP                        20,829      21,462       1,595      77,484
Left wing party (%)                   40.66       11.62        17         64
Doctors per 100,000 pop.              10,253      9,012        1,048      29,996
Pop. 64–75 (%)                        9.40        1.33         6.27       12.22
Fiscal responsibility                 0.1176      0.3233       0          1
Beds per 100,000 pop.                 2.1416      0.5377       1.1700     3.1700

                                      No. of obs.   DF   Jarque-Bera   Prob.
Public per capita exp.                136           2    0.354         0.8375
Per capita pharm. exp.                136           2    1.671         0.4300
Per capita inpatient and spec. exp.   136           2    1.576         0.4546
Per capita primary care exp.          136           2    86.477        0.0000

N = 136

conclude that, overall, the regressors have a significant effect on the dependent variable in all models. Furthermore, the variance inflation factors examined were lower than 15, indicating that multicollinearity is not a problem in our work.
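The probabilities reported alongside the Jarque-Bera statistics in Table 2 can be checked by hand: under normality the statistic is asymptotically χ² with 2 degrees of freedom, whose upper-tail probability has the closed form exp(−JB/2). The sketch below assumes the standard moment-based definition of the statistic and recomputes the probabilities approximately (small differences reflect rounding of the reported statistics). The function names are illustrative, and this is not the authors' Stata or Matlab code.

```python
import math


def jarque_bera(residuals):
    """Moment-based JB statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def chi2_2df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 2 d.f."""
    return math.exp(-stat / 2.0)


# JB statistics as reported in Table 2.
table2_jb = {
    "total": 0.354,
    "pharmaceutical": 1.671,
    "inpatient": 1.576,
    "primary care": 86.477,
}
p_values = {name: chi2_2df_pvalue(jb) for name, jb in table2_jb.items()}

# Categories for which normality is rejected at the 5% level.
rejected = [name for name, p in p_values.items() if p < 0.05]
```

Only primary care spending falls below the 5% threshold, in line with the text.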

5.3 Model selection

Traditional multivariate regression models do not take into account the potential spatial structure of data, which if ignored could lead to biased and inconsistent estimates (Anselin 1988). Thus, allowing for spatial dependence in the regression model should lead to more reliable inference. Our specification strategy is primarily based on two robust Lagrange multiplier (LM) tests, the LM test for spatially autoregressive errors and the LM test for a spatial lag, according to the procedure suggested by Florax et al. (2003). For total expenditure, the LM test statistic for the model with the spatially lagged dependent variable, equal to 42.52, is the only one significant at the 5% level. Similarly, for the other categories of spending the LM test statistics for the model with the spatially lagged dependent variable are the only ones significant at the 5% level, and are equal to 13.39 for pharmaceutical expenditure, 10.20 for inpatient expenditure, and 7.93 for primary care expenditure. This leads us to conclude in favour of the spatial lag model.
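Each of these LM statistics is asymptotically χ² with one degree of freedom under the null hypothesis of no spatial dependence, so the reported values can be compared with the 5% critical value of about 3.84. The short sketch below converts the four reported LM-lag statistics into p-values using the identity P(χ²₁ > x) = erfc(√(x/2)); the function name is ours, and the statistics themselves are simply copied from the text rather than recomputed.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))


# LM statistics for the spatial lag, as reported in the text.
lm_lag = {
    "total": 42.52,
    "pharmaceutical": 13.39,
    "inpatient": 10.20,
    "primary care": 7.93,
}

CRITICAL_5PCT = 3.841  # approximate 95th percentile of chi-square(1)

lm_pvalues = {name: chi2_1df_pvalue(s) for name, s in lm_lag.items()}
lm_significant = {name: s > CRITICAL_5PCT for name, s in lm_lag.items()}
```

All four statistics exceed the critical value, which is the basis for retaining the spatial lag specification in every spending category.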


Table 3 Regression models for per capita total expenditure

Variables                  OLS coeff.         Spatial lag coeff.   Classic RE coeff.   Spatial RE (ML) coeff.   Spatial RE (IV) coeff.
Population                 −0.0460 (0.016)    −0.0425 (0.011)      −0.0358 (0.0154)    −0.0401 (0.0221)         −0.0378 (0.0134)
Time decentralisation      −0.0140 (0.002)    −0.0098 (0.0016)     −0.0065 (0.002)     −0.005 (0.0017)          −0.0104 (0.0021)
Decentralisation           0.2380 (0.029)     0.1643 (0.021)       0.1259 (0.0317)     0.0939 (0.0311)          0.1962 (0.0289)
Per capita GDP             0.0740 (0.014)     0.0584 (0.0108)      0.015 (0.0189)      −0.0005 (0.0169)         −0.0946 (0.0204)
% Left wing party          −0.6380 (0.12)     −0.4694 (0.0943)     −0.1306 (0.1546)    −0.0366 (0.1477)         −0.8561 (0.1796)
GDP*left wing party        0.0750 (0.013)     0.0542 (0.0102)      −0.0126 (0.0169)    −0.073 (0.0164)          0.0928 (0.0194)
Doctors per 100,000 pop.   0.0560 (0.015)     0.0447 (0.0102)      0.0275 (0.0079)     0.0246 (0.0066)          0.0318 (0.0097)
Pop. 64–75                 0.3170 (0.038)     0.1989 (0.0312)      0.1283 (0.04)       0.0732 (0.0355)          0.1691 (0.0443)
Fiscal responsibility      0.0490 (0.018)     0.0577 (0.0138)      −0.0943 (0.0335)    0.0969 (0.0371)          0.0218 (0.0273)
Beds per 100,000 pop.      0.0430 (0.019)     0.077 (0.0152)       −0.0026 (0.01559)   −0.0004 (0.0133)         0.0262 (0.0188)
Intercept                  6.9690 (0.179)     6.715 (0.1497)       7.1332 (0.2962)     6.902 (22.26)            6.4226 (0.2743)
Spatial lag                –                  0.0199 (0.0025)      –                   0.0218 (0.0066)          0.0100 (0.0037)
LIK                        236.04             261.47               301.10              303.03                   –
R2                         0.9342
F stat                     98.57

Temporal dummy variables were included in all regressions. Numbers in parentheses indicate standard errors
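The LIK row of Table 3 also allows a rough likelihood-ratio comparison. This is offered purely as an illustration and is not a test reported by the authors: it assumes that each pair of models is nested, estimated by maximum likelihood on the same sample, and differs by the single parameter ρ, so that LR = 2(ℓ₁ − ℓ₀) is asymptotically χ² with one degree of freedom. The function names are ours.

```python
import math


def likelihood_ratio(loglik_restricted, loglik_unrestricted):
    """LR statistic for two nested models: 2 * (l_1 - l_0)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))


# Log-likelihoods for total expenditure, copied from the LIK row of Table 3.
loglik = {
    "ols": 236.04,
    "spatial_lag": 261.47,
    "classic_re": 301.10,
    "spatial_re_ml": 303.03,
}

# Pooled models: OLS (rho = 0) against the spatial lag model.
lr_pooled = likelihood_ratio(loglik["ols"], loglik["spatial_lag"])
p_pooled = chi2_1df_pvalue(lr_pooled)

# Random effects models: classic RE (rho = 0) against spatial RE.
lr_re = likelihood_ratio(loglik["classic_re"], loglik["spatial_re_ml"])
p_re = chi2_1df_pvalue(lr_re)
```

Under these assumptions the gain from adding the spatial lag is large in the pooled model and much smaller, close to the 5% threshold, once region-specific random effects are included.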

5.4 Estimation results

In this paper we consider both maximum likelihood (ML) and instrumental variable (IV) approaches for the estimation of the model with the spatially lagged dependent variable. The second column of Tables 3, 4, 5, and 6 shows results from the ML estimation of the model including the spatially lagged dependent variable. The likelihood-based measure (LIK) can be used to compare the fit of the spatial lag with the ordinary regression model. It turns out that the fit improves when the spatial lag is added to the model, as indicated by an increase in the log-likelihood (from 236.04 for OLS


Table 4 Regression models for per capita pharmaceutical expenditure

Variables                  OLS coeff.         Spatial lag coeff.   Classic RE coeff.   Spatial RE (ML) coeff.   Spatial RE (IV) coeff.
Population                 −0.0100 (0.0241)   −0.0070 (0.0213)     −0.0430 (0.0416)    −0.0291 (0.0171)         −0.0452 (0.0362)
Time decentralisation      0.0028 (0.0034)    0.0008 (0.0031)      0.0013 (0.0025)     0.0016 (0.0024)          −0.0007 (0.0029)
Decentralisation           0.2150 (0.0421)    0.1515 (0.0408)      0.1516 (0.0786)     0.1083 (0.0536)          0.2262 (0.0708)
Per capita GDP             0.0594 (0.0235)    0.0476 (0.0210)      0.0051 (0.0347)     0.0066 (0.0200)          0.1315 (0.0617)
% Left wing party          −0.4354 (0.2031)   −0.3030 (0.1831)     0.1076 (0.2137)     0.1330 (0.1813)          −1.1484 (0.5357)
GDP*left wing party        0.0541 (0.0219)    0.0376 (0.0199)      −0.0132 (0.0235)    −0.0162 (0.0200)         0.1230 (0.0583)
Doctors per 100,000 pop.   −0.0100 (0.0223)   −0.0196 (0.0199)     −0.0117 (0.0093)    −0.0122 (0.0087)         −0.0141 (0.0107)
Pop. 64–75                 0.5015 (0.0610)    0.3973 (0.0606)      0.1684 (0.0529)     0.1607 (0.0496)          0.1628 (0.0605)
Fiscal responsibility      −0.1959 (0.0302)   −0.1857 (0.0268)     −0.1868 (0.1042)    −0.1652 (0.0671)         −0.3255 (0.0936)
Beds per 100,000 pop.      −0.0568 (0.0321)   −0.0268 (0.0294)     −0.0238 (0.0195)    −0.0211 (0.0183)         −0.0107 (0.0227)
Intercept                  6.1628 (0.3202)    5.7123 (0.2905)      5.8291 (0.7412)     0.0437 (0.0153)          4.9213 (0.8295)
Spatial lag                –                  0.0246 (0.0032)      –                   0.0230 (0.0111)          0.0360 (0.0064)
LIK                        164.1300           171.2700             257.0215            260.2806
R2                         0.8888
F stat                     55.49

Temporal dummy variables were included in all regressions. Numbers in parentheses indicate standard errors

to 261.47 for the spatial lag model, from 164.13 to 171.27, from 124.02 to 129.27, and from −10.31 to −6.82 for total, pharmaceutical, inpatient and outpatient spending, respectively). If we look at total expenditure, the improved fit is as expected, since the spatial lag coefficient turns out to be significant. Estimation of the model yields a positive value for the spatial effect (0.019) with a p value of 0.00, suggesting a potential local interaction as well as policy interdependence among regions. Similarly, pharmaceutical, inpatient and outpatient spending show a positive and significant spatial effect. If we still focus on total spending, compared to the OLS results almost all the estimated parameters, such as “Pop. 64–75” and “Left wing party”, have decreased in absolute value. These alterations in the regression coefficients could be


Table 5 Regression models for per capita inpatient expenditure

Variables                  OLS coeff.         Spatial lag coeff.   Classic RE coeff.   Spatial RE (ML) coeff.   Spatial RE (IV) coeff.
Population                 −0.0470 (0.0323)   −0.0423 (0.0290)     0.0061 (0.0475)     0.0019 (0.0221)          0.0001 (0.0441)
Time decentralisation      −0.0092 (0.0045)   −0.0048 (0.0043)     −0.0024 (0.0035)    0.0023 (0.0032)          0.0024 (0.0037)
Decentralisation           0.1229 (0.0566)    0.0470 (0.0554)      0.0672 (0.0906)     0.0665 (0.0688)          −0.0505 (0.0866)
Per capita GDP             0.0133 (0.0315)    −0.0023 (0.0286)     −0.0394 (0.0433)    0.0360 (0.0274)          −0.0147 (0.0757)
% Left wing party          −0.0617 (0.2728)   0.1087 (0.2491)      0.3397 (0.2936)     0.3443 (0.2508)          0.1069 (0.6580)
GDP*left wing party        0.0130 (0.0294)    −0.0080 (0.0270)     −0.0442 (0.0323)    0.0445 (0.0277)          −0.0190 (0.0716)
Doctors per 100,000 pop.   0.0634 (0.0299)    0.0519 (0.0271)      0.0118 (0.0130)     0.0122 (0.0123)          0.0109 (0.0134)
Pop. 64–75                 0.2395 (0.0819)    0.1105 (0.0825)      0.0371 (0.0731)     0.0306 (0.0678)          0.0327 (0.0757)
Fiscal responsibility      0.2205 (0.0406)    0.2309 (0.0365)      0.2756 (0.1165)     0.2639 (0.0852)          0.2425 (0.1139)
Beds per 100,000 pop.      0.0172 (0.0431)    0.0536 (0.0402)      0.0172 (0.0272)     0.0191 (0.0257)          0.0169 (0.0284)
Intercept                  6.6465 (0.4301)    6.1510 (0.3971)      6.2437 (0.8655)     6.0944 (0.0173)          6.0573 (1.0137)
Spatial lag                –                  0.0247 (0.0074)      –                   0.0192 (0.0168)          0.0063 (0.0136)
LIK                        124.02             129.27               214.83              215.28
R2                         0.7151
F stat                     17.42

Temporal dummy variables were included in all regressions. Numbers in parentheses indicate standard errors

explained, as noted at the beginning of this section, by a marked spatial pattern of expenditure. The same can be observed across different spending categories. One drawback of Eq. (1) is that it assumes that the relationship between determinants and expenditure is homogeneous across regions, an assumption which is restrictive and unlikely to hold in this study. We suspect the existence of unobserved variability that, if not properly incorporated in the model, may generate incorrect conclusions of spatial correlation (McMillen 2003). This leads us to the estimation of a random effects panel data model, extended to include a spatially lagged dependent variable. The random effects specification allows us to capture time-invariant heterogeneity across political units through an individual authority-specific


Table 6 Regression models for per capita primary care expenditure

Variables                  OLS coeff.         Spatial lag coeff.   Classic RE coeff.   Spatial RE (ML) coeff.   Spatial RE (IV) coeff.
Population                 −0.1780 (0.0867)   −0.1705 (0.0788)     −0.1196 (0.1167)    −0.1077 (0.0295)         −0.0824 (0.1275)
Time decentralisation      0.0018 (0.0122)    0.0122 (0.0117)      0.0213 (0.0030)     0.0208 (0.0029)          0.0182 (0.0041)
Decentralisation           0.1458 (0.1519)    −0.0231 (0.1521)     −0.2272 (0.2393)    −0.2537 (0.1850)         −0.1967 (0.2451)
Per capita GDP             0.7078 (0.0846)    0.6737 (0.0775)      0.2084 (0.0821)     0.2087 (0.0563)          0.3713 (0.1348)
% Left wing party          −5.1446 (0.7324)   −4.7297 (0.6778)     −0.1560 (0.2633)    −0.1440 (0.2489)         −2.1344 (1.1239)
GDP*left wing party        0.5768 (0.0789)    0.5259 (0.0734)      0.0093 (0.0290)     0.0080 (0.0274)          0.2246 (0.1227)
Doctors per 100,000 pop.   0.1391 (0.0804)    0.1136 (0.0734)      0.0381 (0.0111)     0.0370 (0.0105)          0.0356 (0.0137)
Pop. 64–75                 0.4640 (0.2198)    0.1538 (0.2241)      −0.1540 (0.0646)    −0.1540 (0.0606)         −0.1213 (0.0807)
Fiscal responsibility      −0.1467 (0.1090)   −0.1325 (0.0992)     0.1599 (0.3393)     0.1630 (0.2683)          0.0438 (0.3483)
Beds per 100,000 pop.      −0.0915 (0.1157)   −0.0125 (0.1102)     −0.0037 (0.0235)    −0.0037 (0.0223)         0.0119 (0.0298)
Intercept                  0.7985 (1.1549)    0.0691 (0.0250)      3.5828 (2.0482)     3.3150 (0.5776)          1.6484 (2.5287)
Spatial lag                –                  0.0691 (0.0250)      –                   0.0502 (0.0491)          0.0312 (0.0407)
LIK                        −10.31             −6.82                129.13              215.47
R2                         0.5634
F stat                     8.96

Temporal dummy variables were included in all regressions. Numbers in parentheses indicate standard errors

error component, and thus may achieve a gain in information and efficiency when compared to a pooled regression (Hsiao 2003). Columns 3 and 4 of Tables 3, 4, 5, and 6 contain maximum likelihood estimates of the classical random effects model and of the random effects model with a spatial lag [Eqs. (2)–(3)]. The model with the spatially lagged dependent variable achieves an increase in the likelihood when compared to its classical counterpart and the pooled estimations (from 301.10 to 303.03, from 257.02 to 260.28, from 214.83 to 215.28, and from 129.13 to 215.47 for total, pharmaceutical, inpatient and outpatient spending, respectively). Among the different possible ways to model unobserved heterogeneity, we selected the random effects model on the basis of a Hausman test, which has a value of 8.76 (with


a p value of 0.07) for total health spending, 20.19 (p value 0.16) for pharmaceutical spending, 4.98 (p value 0.99) for inpatients, and 10.31 (p value 0.06) for primary care spending.

A further issue that needs to be addressed is the endogeneity of some of the regressors. We believe that one source of endogeneity in our study is the political orientation of Spanish regions (Navarro et al. 2006). We dealt with this problem by using an instrumental variable approach, which seeks to remove endogeneity in the variable “Left wing” and in the spatial effect. As an instrument for the political affiliation of each region state we used the republican border during the Spanish Civil War. Indeed, regions that were faithful to the republic are now more likely to be governed by left wing or regional nationalist parties. At the same time, this variable is not related to health expenditure and can thus be considered exogenous in our model.

If we focus on aggregate expenditure (column 5 of Table 3), nine coefficients out of eleven are highly significant and have the expected signs. The negative coefficient on population size implies that more populous regions are likely to exhibit economies of scale in the provision of health care. However, the coefficient on population is not significant for the individual expenditure categories. Political decentralisation appears to increase total expenditure when new region states are set up from scratch, as has been the case in Spain, given that there are significant sunk costs in designing a decentralised provision of health care. However, after a number of years efficiency effects progressively come into play, once time with decentralised responsibilities is controlled for in the empirical specification. Therefore, unlike previous studies, our findings suggest that some efficiency in the form of cost savings could be achieved from decentralisation in the long run. 
Similarly, fiscal decentralization seems to have a positive influence on pharmaceutical spending, though time of decentralization is not significant, which is consistent with the fact that drug prices are determined at the central level. Vice versa, while decentralization seems not to have an impact on outpatient spending, time of decentralization is significant. Neither factor seems to play a role, ceteris paribus, in explaining inpatient spending, which is governed by the subsequent progressive decentralisation of hospital management.

Probably one of the most striking results of our study is that, after these controls, regional income displays a significant and negative effect at the aggregate level, which differs from the luxury good evidence of cross-country studies. Examining the effect of regional income by expenditure type, we find that it is not significant for inpatients whilst it is significant for primary care, most likely because richer regions manage to strengthen their primary care networks to cover all the potential population.

Another important effect is that of political distance. Indeed, ideology alone displays a counter-intuitive effect, resulting from the fact that left wing governments are less likely to be contested when they cut total expenditure; some other studies obtain similar counter-intuitive evidence (Tavares 2004). However, when an interaction between income and ideology is added, it has a positive and significant effect, indicating that ideology does matter at a given income level and that political effects require a certain degree of income. This suggests that relatively richer regions with left wing governments are more likely to expand health expenditure further. An interpretation may lie in the well


known interactions between public and private sectors. Indeed, in relatively richer region-states the private sector might develop to counteract the pitfalls of the national health system. Yet, left wing governments increase expenditure to maintain the support of the middle classes for the NHS, as a supply-induced demand model would predict (Davis et al. 2000). When examining the effect on different expenditure sources we find that political ideology seems to influence only pharmaceutical and outpatient expenditure. The former is the only responsibility that has remained at the central level. Conversely, we find that this interaction effect is significant in explaining primary care expenditure, which is a regional responsibility. A larger availability of inputs (number of doctors per 100,000 pop.) raises total health care expenditure. The variable “% Pop. 64–75” stands as an indicator of health related need (explaining a higher demand for health care). In examining the specific effect of this variable by expenditure type we find that ageing seems to influence pharmaceutical expenditure. Yet this has to do with the specific design of pharmaceutical cost sharing, whereby the retired population is entitled to free drugs, rather than with a pure demand effect. In addition, ageing seems to reduce primary care expenditure, possibly due to the existence of a substitution effect with other health inputs such as pharmaceuticals. Interestingly, we have identified no significant effect of either supply or demand on inpatient expenditure, whilst, as expected, a higher doctor concentration leads to higher primary care expenditure. The number of beds seems not to be significant, which indicates that structure and supply factors do not explain expenditure variability. One possible explanation lies in the potential substitution between different supply factors as well as some mimicking of policy among region states.
Arguably, an important variable to explain expenditure from a fiscal federalism standpoint is that of fiscal accountability. Fiscal accountability, in a context where health care is a high priority for citizens, displays a positive coefficient, though it is not significant once the full set of controls is introduced. Yet, when this variable is examined by expenditure type, we find that it increases expenditure in those areas where regional governments have some responsibilities, that is, in all expenditure types except pharmaceuticals. Therefore, our evidence suggests that when region-state governments, which are in turn politically accountable, are invested with powers to raise taxes to pay for health care they do expand expenditure, given that health care is a high public policy priority for which people are less reluctant to pay taxes. Finally, the spatial coefficient is positive and statistically significant, and therefore the z test rejects the null hypothesis of absence of spatial interactions. In particular, the spatially lagged dependent variable is significant, with a parameter of 0.01 and a p value of 0.00. Results are consistent with previous studies that showed some degree of spending interdependence between political units (Moscone and Knapp 2005; Revelli 2002, 2006; Moscone et al. 2007a, b).

6 Discussion

6.1 Limitations

It is important to stress some limitations of our empirical study. In our applied work we have estimated separately one equation for each category of spending. An alternative


J. Costa-Font, F. Moscone

approach is the estimation of a system of equations connected through cross-equation error correlation, in a SURE type framework (Zellner 1962; Moscone et al. 2007b). This, however, would entail significant computational complexities that are beyond the scope of the present paper and may be the subject of future work. Further, though we have used the contiguity criterion of neighbourliness, this is certainly not exhaustive. We cannot rule out the existence of alternative specifications of potential interaction based on economic, political and policy distances. Finally, whilst we recognise that there exist various sources of endogeneity, our models only deal with a particular form of endogeneity, arising from political orientation of Spanish regions and interdependencies of spatial units.
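To make the contiguity criterion concrete, the following minimal sketch builds a row-standardised contiguity matrix and the corresponding spatial lag. The four regions, their adjacency pattern and the spending figures are hypothetical and serve only as an illustration; they are not the Spanish data used in this chapter.

```python
import numpy as np


def row_standardise(adjacency):
    """Turn a non-negative neighbourhood matrix into row-standardised weights."""
    A = np.asarray(adjacency, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    # Regions without neighbours keep a row of zeros instead of dividing by 0.
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)


# Hypothetical example: four regions arranged along a line (0-1-2-3).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
W = row_standardise(adjacency)

# Hypothetical per capita spending in each region.
spending = np.array([10.0, 20.0, 30.0, 40.0])

# Spatial lag: the average spending of each region's neighbours.
spatial_lag = W @ spending
print(spatial_lag)
```

Because `row_standardise` accepts any non-negative matrix, the same helper could be fed weights based on economic, political or policy distance rather than contiguity, which is the kind of alternative specification mentioned above.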

6.2 Concluding remarks

This paper has sought to examine the influence of a set of institutional, political and economic determinants of health care activity. Spain is a relevant setting because of significant regional and institutional differences, which can be captured by taking into account unobserved spatial effects. Our contribution to the literature lies in the following findings. We find that although decentralisation initially increases regional health expenditure (e.g. due to the effect of sunk costs), there is evidence of an “experience effect”, indicating that decentralisation enables expenditure cuts in the long run. However, these effects differ depending on the type of spending. One possible explanation for expenditure heterogeneity can be found in institutional factors (e.g. responsibilities for drug expenditures are not fully decentralised). When examining several health expenditure sources, there might be potential substitution between different health inputs in producing health activity (and expenditure source) depending on regional specific policy preferences (e.g., some regions rely more heavily on drug treatments). Compared to other studies, political effects are consistent with classical predictions once an interaction with income is introduced. Regional left wing incumbents raise public health expenditure in relatively richer regions, which is in part due to the increasing competition with the private sector in such areas. Another finding worth noting is that regional income exerts only a moderate influence in explaining regional expenditure, suggesting evidence of limited regional inequalities among Spanish region states. This has important implications, as it suggests that decentralisation is not likely to give rise to regional inequalities. Furthermore, for those components of health care expenditure where income has positive effects, we find that the elasticity is below unity.
An aging population and a larger concentration of health care providers, as expected, tend to increase the costs of the health system. That is, activity and overall expenditure is expected to grow with higher health care needs and higher intensity in the use of certain health care inputs. Finally, spatial interactions among regions seem to play a role in explaining total expenditure and its major categories (pharmaceutical, inpatient, and ambulatory). Results are consistent with some degree of interdependence in spending behavior between neighbouring regions, corroborating previous findings in public and health economics synthesized in Sect. 4.


Acknowledgments We would like to thank participants at the International Workshop in Spatial Econometrics and Statistics in Rome, the editors Badi Baltagi and Giuseppe Arbia, and two anonymous referees for valuable comments and suggestions. We are grateful to Marin Gemmill, Riccardo Maestri, and Elisa Tosetti for helpful comments and discussions on the current version. Finally, Joan Costa-Font is grateful for the support of the Institut Ramon Llull (Generalitat de Catalunya).

References

Alexander F (1993) Viruses, clusters and clustering of childhood leukemia. Eur J Cancer 29:24–43
Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht
Arbia G (2006) Introductory spatial econometrics with applications to regional convergence. Springer, Berlin
Baicker K (2005) The spillover effects of state spending. J Public Econ 89:529–544
Baltagi BH, Song SH, Koh W (2003) Testing panel data regression models with spatial error correlation. J Econom 117:123–150
Besley T, Case A (1995) Incumbent’s behavior: vote seeking, tax-setting and yardstick competition. Am Econ Rev 85:25–45
Brueckner JK (2000) Welfare reform and the race to the bottom: theory and evidence. South Econ J 66:505–525
Costa-Font J, Pons-Novell J (2007) Public health expenditure and spatial interactions in a decentralized national health system. Health Econ 16:291–306
Costa-Font J, Rico A (2006) Vertical competition in the Spanish National health System. Public Choice 128:477–498
Crivelli L, Filippini M, Mosca I (2006) Federalism and regional health expenditures: an empirical analysis of Swiss cantons. Health Econ 15:535–541
Davis P, Gribben B, Scott A, Lay-Yee R (2000) The supply hypothesis and medical practice variation in primary care: testing economic and clinical models of inter-practitioner variation. Soc Sci Med 50:407–418
Di Matteo L, Di Matteo R (1998) Evidence on the determinants of Canadian Provincial Government Health Expenditure 1965–1991. J Health Econ 17:211–228
Florax RJGM, Folmer H, Rey SJ (2003) Specification searches in spatial econometrics: the relevance of Hendry’s methodology. Reg Sci Urban Econ 33:557–579
Gatrell AC, Whitelegg J (1993) Incidence of childhood cancer in Preston and South Ribble. Research Report, Environmental Epidemiology Research Unit, Lancaster University
Giannoni M, Hitiris T (2002) The regional impact of health care expenditure: the case of Italy. Appl Econ 34:1829–1836
Henrekson M (1988) Swedish government growth: a disequilibrium analysis. In: Lybeck JA, Henrekson M (eds) Explaining the Growth of Government. North-Holland, Amsterdam
Hsiao C (2003) Analysis of panel data, econometric society monographs. Cambridge University Press, London
Hsiao CK (2000) Comparing the performance of two indices for spatial model selection: application to two mortality data. Stat Med 19:1915–1930
Lopez-Casasnovas G, Costa-Font J, Planas I (2005) Diversity and regional inequalities: assessing the outcomes of the Spanish ‘system of health care services’. Health Econ 14S:S221–S235
Lorant VT, Thomas I, Deilege I, Tonglet R (2001) Deprivation and mortality: the implications of spatial autocorrelation for health resources allocation. Soc Sci Med 53:1711–1719
McMillen DP (2003) Spatial autocorrelation or model misspecification? Int Reg Sci Rev 26:208–217
Moscone F, Knapp M (2005) Exploring the spatial pattern of mental health expenditure. J Mental Health Policy Econ 8:205–217
Moscone F, Knapp M, Tosetti E (2007a) Mental health expenditure in England: a spatial panel approach. J Health Econ 4:659–864
Moscone F, Tosetti E, Knapp M (2007b) SUR model with spatial effects: an application to mental health expenditure. Health Econ Lett 11(2):3–9
Navarro V, Muntaner C, Borrell C, Benach J, Quiroga A, Rodriguez-Sanz M, Verges N, Pasarin MI (2006) Politics and health outcomes. Lancet 368:1033–1037
Newhouse JP (1977) Medical care expenditure: a cross-national survey. J Hum Resour 12:115–125
Newhouse JP (1992) Medical care costs: how much welfare loss? J Econ Perspect 6:3–21
Okunade AA, Murthy VNR (2002) Technology as a major driver of health care costs: a cointegration analysis of the Newhouse conjecture. J Health Econ 21:147–159
Parkin D, McGuire A, Yule B (1987) Aggregate health care expenditures and national income: is health care a luxury good? J Health Econ 6:109–127
Revelli F (2002) Testing the tax mimicking versus expenditure spill-over hypotheses using English data. Appl Econ 34:1723–1731
Revelli F (2006) Performance rating and yardstick competition in social service provision. J Public Econ 90:459–475
Rico A, Costa-Font J (2006) Power rather than path? The dynamics of health care federalism in Spain. J Health Polit Policy Law 30:231–252
Skinner J, Wennberg JE (2000) Regional inequality in medicare spending. The key to medicare reform? Front Health Policy Res 3:89–96
Tavares J (2004) Does the right or left matter? Cabinets, credibility and adjustments. J Public Econ 88:2447–2468
Thouez JP, Emard JF, Beaupre M, Latreille J, Ghadirian P (1997) Space-time analysis of the incidence of cancer in certain sites of Quebec: 1984–1986 and 1989–1991. Can J Public Health 88:48–51
Zellner A (1962) An efficient method for estimating seemingly unrelated regressions and tests of aggregation bias. J Am Stat Assoc 58:977–992

Regional evidence on financial development, finance term structure and growth Andrea Vaona

Abstract The finance-growth nexus is a classic source of debate among economists. This paper offers regional evidence on this issue in order to determine whether it can fit the data on a 147-year-old economic union, Italy. By means of this approach the pooling of developed and developing countries in the same sample can be avoided. Both cross-sectional and panel data estimates appear to show that more finance generates more growth. Endogeneity does not bias the results to a significant extent, and the finance-growth nexus is robust to spatial unobserved heterogeneity. Spatial correlation in the residuals is rejected by the data. Economic growth appears to be favoured more by short-term than by long-term credit.

Keywords Finance-growth nexus · Regions · Finance term structure · Cross-section analysis · Panel data analysis

JEL Classification O18 · O16 · C31

1 Introduction

The relationship between financial development and economic growth has long been debated by economists. Various approaches to the issue have been surveyed by Levine (2004), who discusses both theoretical and empirical studies. The latter range among

A. Vaona (B) Department of Economics and Quantitative Methods, University of Pavia, Via S. Felice 5, 27100 Pavia, Italy e-mail: [email protected] URL: http://www.webalice.it/avaona A. Vaona Kiel Institute for the World Economy, Kiel, Germany


historical case studies, firm-level studies, time series studies on an individual country or on a limited number of countries, cross-sectional and panel data analyses. These last comprise studies focused on industries, like Rajan and Zingales (1998), and those surveyed below which focus on countries. The aim of this paper is to offer new perspectives on this long-standing debate by analysing the effect of financial development, defined as enlargement of the banking sector, on growth by using a regional dataset. In this way, it will be possible to avoid pooling developed with developing countries, where the economic mechanisms at work may differ greatly as argued by Usai and Vannini (2005) and shown by Schiavo and Vaona (2007). By focusing on a country like Italy, where regional disparities have been a controversial issue since national unification in 1860, it is possible to maintain substantial variability within the sample. Moreover, regional data on Italy have recently attracted considerable attention in studies on various aspects of financial development (Guiso et al. 2004a,b, 2006; Usai and Vannini 2005). Driffil (2003) claims that growth theories based on agglomeration economies and falling transport costs may offer more valuable insights than those concerned with the link between finance and growth. As a consequence, a regional dataset may enable valid tests regarding the robustness of the finance-growth nexus because such a dataset represents a limit condition of economic integration as compared to cross-country datasets (Guiso et al. 2004a). If agglomeration forces and the dynamics of transport costs are the dominant factors explaining economic growth, the finance-growth nexus should disappear within countries. Contrary to Guiso et al. 
(2004a), this study does not consider indicators of financial development derived from micro data; rather, it considers aggregate ones directly concerning the size of the banking sector relative to the local economy as a measure of its degree of financial intermediation. As a consequence, the results of this study are more directly comparable with those set out in the cross-country literature. Moreover, it is possible to introduce within a regional setting the methodological advances achieved by the cross-country literature in the last 15 years. This study consequently considers not only cross-sectional estimators but panel data ones as well. In both cases, estimates robust to unobserved heterogeneity are reported, which is important given the sensitivity of growth studies to model misspecification and to the omission of technological progress (Levine and Renelt 1992; Islam 1995; in the finance-growth literature Driffil 2003 and Manning 2003). Consideration of a panel dataset also makes it possible to test for the poolability of the regions involved in the present study, following Schiavo and Vaona (2007). Finally, providing both cross-section and panel data estimators is important because it enables to compare the results obtained here with those of the cross-country literature and of other regional studies using different financial development indicators. In this study spatial correlation does not affect the models estimated. The importance of testing for spatial correlation when analysing the impact of local financial development on growth has to date been overlooked. Guiso et al. (2004a) rightly point out that distance is very important in the credit market because it may produce geographic segmentation. If this is the case, local financial variables will have a statistically significant impact on real variables. However, if the model estimated does not fully capture the links among different regions within the credit market, the residuals


will display spatial correlation that produces biased standard errors and unreliable statistical inferences. The paper considers a finer level of geographical disaggregation than that examined by Usai and Vannini (2005). The latter analyse NUTS2 regions, whereas this study is concerned with NUTS3 regions,1 the purpose being to offer results comparable to those of cross-country studies, and to consider small open economies in light of the analogy with a hypothetical, fully-integrated world economy proposed in the literature.2 This approach also makes it possible to adopt dynamic panel data estimators, so that the problems of endogeneity and unobserved heterogeneity can be addressed more satisfactorily. Finally, given that the Bank of Italy collects financial data distinguished between long and short-term credit, it is possible to assess the impact of different financial term structures on local growth rates. This is particularly interesting because studies on financial structure usually focus more on its effect on firm size or on the opportunities for firm growth than, as here, on its aggregate effect on economic growth (Caprio and Demirgüç-Kunt 1997). In an economy especially reliant on small firms like Italy’s, short-term credit may enable the funding of long-term projects, given that small firms usually have less collateral than large ones and may be rationed when applying for long-term credit. Moreover, this may be particularly the case in lagging regions, where opportunistic behaviour is more common and monitoring costs are greater, so that firms operating in different regions have different access to credit. The rest of this paper is structured as follows. First, a brief survey is conducted on studies regarding both the link between finance and growth across countries and firm debt structure, the purpose being to show the main econometric issues tackled by the relevant literature. 
Next, the model specification, the data collected and the econometric methods of the paper are described. Finally, estimation results are illustrated, while the last section concludes.

2 Literature survey

The literature survey which follows deals mainly with cross-country studies that define financial development as improvement in the working of banks. However, there exist other studies which consider financial development in terms of institutional changes or a deepening of the stock market (see for instance Levine and Zervos 1998 or Beck and Levine 2004, and others surveyed by Levine 2004). Previous research has been mainly concerned with the following econometric issues: model specification, the endogeneity of financial indicators, unobserved heterogeneity, and the frequency of the data.

1 NUTS is the French acronym for Nomenclature of Territorial Units for Statistics used by Eurostat. In this nomenclature NUTS1 refers to European Community Regions and NUTS2 to Basic Administrative Units, while NUTS3 is the label for smaller spatial units more similar to counties in the US. To be noted is that the datasets used by the present study have a cross-sectional dimension very similar to those used in the cross-country studies reviewed by Levine (2004).

2 Guiso et al. (2004a) argue that both the Italian Antitrust Authority and the Bank of Italy regard provinces as the “relevant market” for banking.


Since the seminal contributions by King and Levine (1993a,b), attention has focused on whether financial development is a precondition for or a consequence of economic growth. Various studies have been conducted with different model specifications and, consequently, conclusions. King and Levine (1993a,b,c), extending the analysis of Goldsmith (1969), carry out a cross-sectional analysis of a dataset of 80 countries over the period 1960–1989 in order to determine whether financial development can be considered a predictor of future long-run growth, capital accumulation and productivity growth. They propose four measures of the level of financial development:
• DEPTH: liquid liabilities of financial intermediaries over GDP;
• BANK: the ratio of private bank credit over the sum of private bank credit and central bank credit;
• PRIVATE: the ratio of the credit allocated to private enterprises over total domestic credit;
• PRIVY: the ratio of the credit to private enterprises over GDP.
The model specification is as follows:

$$ G = \alpha + \beta F + \gamma X + \varepsilon \qquad (1) $$

where G is either per capita GDP growth, or growth of the capital stock per head, or productivity growth; F is either DEPTH or BANK or PRIVATE or PRIVY; and X is a set of controls (income per capita, education, political stability, indicators of exchange rate developments, international trade, fiscal and monetary policy). α, β and γ are coefficients, while ε is the stochastic error. King and Levine (1993a,b,c) conclude that the level of financial development at the beginning of the period can be considered as a good predictor of future economic growth. More recently, much research effort has been devoted to analysing potential biases deriving from the endogeneity of financial development measures with respect to growth. Levine and Zervos (1998); Levine (1999) and Levine et al. (2000) use the La Porta et al. (1998) measures of legal origin as instrumental variables. In particular, La Porta et al. (1998) show that legal origin—whether a country’s Commercial/ Company Law derives from British, French, German, or Scandinavian law – considerably affects the letter and enforcement of national credit laws, yielding different results in the protection of external investors and promoting financial development to different extents. Levine et al. (2000) analyse 71 countries, adopting the generalized method of moments (GMM) estimator and considering a model similar to (1), where G is real per capita GDP growth over the 1960–1995 period. Measures of financial development are instrumented with legal origin indicators. The variables included in X, the conditioning set, are treated as exogenous. They also cover a longer time span than King and Levine (1993a,b), including the years from 1989 to 1995. Levine et al. (2000) add a new measure of overall financial development called Private Credit, which is defined as the value of credit by financial intermediaries to the private sector divided by GDP. 
While PRIVY includes credit issued by the monetary authority and government agencies, Private credit includes only credit issued by banks and other financial


intermediaries. This measure also isolates credit issued to the private sector and therefore excludes credit issued to governments, government agencies and public enterprises.3 The above studies conclude that financial development plays a first-order role in explaining economic growth. However, both Manning (2003) and Driffil (2003) have recently argued that these studies may not have properly considered the role of unobserved country heterogeneity. They show that, within a cross-sectional setting, the effect of financial development on growth disappears once dummies for some subsets of countries are inserted, either according to the continent in which they are situated or because they have achieved outstanding growth performance (the “Asian tigers”, for instance). These results induce Driffil (2003) to conclude that New Economic Geography, which relies on agglomeration economies and transport costs, may provide a better account of growth and catching up. Levine et al. (2000) is an important contribution not only for its instrumenting of financial development indicators in a cross-sectional analysis, but also for its use of dynamic panel data estimation, as in Beck et al. (2000). This method yields results robust to unobserved heterogeneity. In order to exploit both time series and crosssection variation, Levine et al. (2000) employ data averaged over 5-year-periods, avoiding the use of data at annual frequency in an attempt to capture long run relationships. If dynamic panel data estimators are used, one can deal with unobserved heterogeneity and instrument not only financial development variables but also the variables belonging to the conditioning set. Levine et al. (2000) examine the relationship between financial intermediation and growth, while Beck et al. (2000) analyse the relationship between financial development and the sources of growth, i.e., productivity growth, physical capital accumulation, and savings. 
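The deflation procedure that footnote 3 attributes to Levine et al. (2000) can be written out as a short calculation: end-of-year credit is deflated by the end-of-year CPI, GDP by the annual CPI, and the average of real credit in years t and t − 1 is divided by real GDP in year t. The function name, argument names and numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def real_private_credit_ratio(credit_t, credit_prev,
                              cpi_end_t, cpi_end_prev,
                              gdp_t, cpi_avg_t):
    """Private credit over GDP with stock/flow-consistent deflation.

    credit_t, credit_prev   : nominal end-of-year credit in years t and t-1
    cpi_end_t, cpi_end_prev : end-of-year CPI in years t and t-1
    gdp_t                   : nominal GDP over year t
    cpi_avg_t               : annual (average) CPI for year t
    """
    real_credit_t = credit_t / cpi_end_t
    real_credit_prev = credit_prev / cpi_end_prev
    real_gdp_t = gdp_t / cpi_avg_t
    # Average the two end-of-year stocks so that they line up with the
    # flow of GDP measured over year t.
    return 0.5 * (real_credit_t + real_credit_prev) / real_gdp_t


# Hypothetical figures: real credit is 100 and 90, real GDP is 200.
ratio = real_private_credit_ratio(
    credit_t=110.0, credit_prev=90.0,
    cpi_end_t=1.10, cpi_end_prev=1.00,
    gdp_t=210.0, cpi_avg_t=1.05,
)
print(round(ratio, 3))  # 0.5 * (100 + 90) / 200 = 0.475
```

Averaging the two year-end stocks is what reconciles a stock variable (credit) with a flow variable (GDP); using only the year t stock would overstate the ratio whenever credit grows during the year.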
3 In regard to deflation of the financial development indicators, while the balance sheet items of financial intermediaries are measured at the end of the year, GDP is measured over the year. Levine et al. (2000) deflate end-of-year financial balance sheet items by end-of-year consumer price indexes (CPI) and deflate the GDP series by the annual CPI. They then compute the average of the real financial balance sheet items in year t and t − 1 and divide this average by real GDP measured in year t.

With regard to the frequency of the data, Beck and Levine (2004) check whether the annual frequency of the data affects the results in comparison to those obtained by studies which rely on 5 year averages. They find that the relationship between Bank Credit and growth disappears when annual data are used. Connecting this result to Loayza and Ranciere (2004), they argue that short-run surges in Bank Credit are good predictors of banking crises and slow growth, while high levels of Bank Credit over the long run are positively associated with economic growth. These results emphasize the importance of using sufficiently low-frequency data in order to move beyond cyclical effects. Turning to the literature on the finance term structure, this has mainly dealt with firm level data of developing countries. It is difficult to tell a priori whether either short-term or long-term credit is more effective in supporting economic development. On the one hand, pervasive market imperfections may prevent firms in developing countries from establishing long-term relationships with banks and from financing
far-reaching projects that may generate economic growth. On the other hand, short-term credit may induce banks to exercise closer control over borrowers and projects. Moreover, public banks focusing on long-term credit are faced by the same accounting and monitoring problems as private ones. Finally, short-term credit may reflect new information better, but long-term credit may protect firms against creditors’ imperfect information and opportunistic behaviour, as well as against temporary shocks (Caprio and Demirgüç-Kunt 1997; Diamond 1991). The dataset analysed here provides a particular standpoint from which to assess the effect of finance term structure on growth. Italy is well-known for the economic importance of small firms, and for the social ties that often connect various firms together, and firms to banks, which induces the formation of industrial districts (Observatory of European SMEs 2003a,b; Becattini et al. 1992). These are two countervailing forces: small firms are usually discriminated against when applying for long-term credit; but at the same time the milieu of industrial districts may favour the formation of long-term relationships between banks and firms, so that the latter can fund long-term projects by resorting to short-term credit.

3 Model specification and data issues

Cross-section data were first analysed. For this purpose, we adopted a model specification similar to (1) which regressed the percentage growth rate of real per capita value added in the Italian provinces between 1986 and 2003 (G) on a financial development indicator and a number of controls, taken at their 1986 values.4 Controls (X) were the sum of exports and imports over value added, the number of students enrolled at secondary school over local resident population, the value of finished public infrastructures over value added, the number of crimes per head, and the level of provincial value added per head.
In order to deflate value added, we used the consumer price index (CPI), which in Italy is measured in the main cities of NUTS2-regions and NUTS3-provinces. Cross-sectional estimates relied on the CPI of the main cities of NUTS2-regions, because using the CPI of those of NUTS3-provinces entailed losing about one third of the observations.5 This choice may have introduced some measurement error into the dependent variable, but this kind of measurement error does not affect coefficient estimates and standard errors (Wooldridge 2001). The level of provincial value added per head was not affected by measurement error because 1986 was taken as the base year. Given that the analysis was concerned with provinces, exports and imports only included international trade, not trade with other Italian provinces, which is of course not registered at customs offices. However, more internationalised regions may achieve faster growth by exploiting international comparative advantages, so that it appeared advisable to include this control as well.

4 Cross-sectional estimates cannot be interpreted as resulting from a pooled OLS panel estimator as the dependent variable is the future growth rate, while regressors are taken at their value at the beginning of the period of observation.

5 Vaona (2006) sets out results obtained deflating value added not only by the regional CPI but also by the provincial one. Estimates are stable.
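The cross-sectional specification described in this section, in which growth over the whole period is regressed on a financial indicator and controls measured at the start of the period, can be sketched as an ordinary least squares fit. The data below are simulated, and the number of provinces, the coefficient values and the single control are assumptions made for illustration; none of them are results or data from this paper.

```python
import numpy as np

# Simulated data (assumption): one row per hypothetical province.
rng = np.random.default_rng(1)
n_provinces = 95
credit_ratio = rng.uniform(0.1, 0.6, size=n_provinces)  # credit / value added at start
initial_va = rng.normal(0.0, 1.0, size=n_provinces)     # standardised initial value added
noise = rng.normal(0.0, 1.0, size=n_provinces)

# Assumed data-generating process: growth rises with credit and falls with
# the initial level of value added (convergence).
growth = 10.0 + 30.0 * credit_ratio - 5.0 * initial_va + noise

# Design matrix: constant, financial indicator F, control X.
X = np.column_stack([np.ones(n_provinces), credit_ratio, initial_va])
coef, *_ = np.linalg.lstsq(X, growth, rcond=None)

# Classical (homoskedastic) standard errors.
resid = growth - X @ coef
sigma2 = resid @ resid / (n_provinces - X.shape[1])
std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

for name, b, se in zip(["const", "credit_ratio", "initial_va"], coef, std_err):
    print(f"{name:>12}: {b:8.3f} (s.e. {se:.3f})")
```

Because the regressors are dated at the start of the period and the dependent variable is subsequent growth, each province contributes a single observation, which is why such a regression is not a pooled panel estimator (see footnote 4).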


As regards indicators of financial development (F), two possibilities were available:
• the ratio of short-term credit over value added;
• the ratio of long-term credit over value added.
Therefore, the measures of financial development adopted were very similar to PRIVY used by King and Levine (1993a,b,c) and they both concerned financial intermediation. When the panel dataset was analysed, estimates for both a static and a dynamic model were implemented. In the former case, a model specification similar to (1) was adopted, regressing a three (six) year average of the percentage growth rate of real per head value added on the financial indicators (short-term or long-term credit over value added). We included all the controls used in the cross-sectional estimates except the value of finished public infrastructures over value added, which is not available after the year 2000. To capture convergence forces, the model also considered the real value added per head at the beginning of each of the three (six) year time periods, as in Kahn and Senhadji (2001) and in the literature surveyed in Vaona and Schiavo (2007). Regressors were thus selected so that comparison between the panel and cross-sectional estimates would be made straightforward. Both three and six-year averages were considered in order to check whether the frequency of the data affected the coefficient estimates.6 When a dynamic model was used, the log of real per head value added was regressed on its first lag, the log of the financial indicators and the usual controls. The log of the financial indicators was used to capture possible non-linearities in the relationship between finance and growth, as in Levine et al. (2000). Summing up, the model specification was as follows:

$$ y_{i,t} = \alpha y_{i,t-1} + \beta' X_{i,t} + \eta_i + \varepsilon_{i,t} \qquad (2) $$

where $y_{i,t}$ is the log of real per capita value added at time t in province i, $X_{i,t}$ is a set of controls including financial indicators, $\eta_i$ is an unobserved province-specific effect, and $\varepsilon_{i,t}$ is a stochastic error. Regional dummies displaying strong explanatory power in the cross-sectional regressions were also inserted in order to check whether their effect carried over to the dynamic panel model. In the panel estimates, data deflated by the CPI in the provinces’ main cities were used, given that the problems of sample size were less binding in this case.

The data involved in this study and their sources are shown in Table 1. Descriptive statistics regarding both cross-sectional and panel data for the dependent variable and the main indicators of financial development are set out in Table 2. They show that there was substantial variability in the sample. The minimum growth rate between 1986 and 2003 was exhibited by the province of Rieti (−0.5%), and the maximum one by the province of Potenza (+79.7%). Financial indicators also display marked variability. For instance, in 1986 long-term credit over value added reached its minimum value in the province of Benevento (7%) and its maximum one in the province of Rome (31%). Similarly, in 1986, short-term credit over value added varied from 10 to 57%, while for

6 Three year averages were also used in de la Fuente (2002).


A. Vaona

Table 1 Data and sources

| Data | Sources |
|---|---|
| Value added | Tagliacarne Institute |
| Exports | ISTAT |
| Imports | ISTAT |
| Inflation measured in the region’s and in the province’s main city in CPI | ISTAT |
| Number of students enrolled at secondary schools | ISTAT |
| Value of finished public infrastructures | ISTAT |
| Value of short-term bank credit | Bank of Italy |
| Value of long-term bank credit | Bank of Italy |
| Resident population | ISTAT |

ISTAT is the Italian National Statistical Office

Table 2 Descriptive statistics of the growth rate of real value added per capita and of the main financial indicators used in the cross-sectional and panel estimates (three year averages)

| Sample | Variable | Observations | Mean | SD | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Cross-section | Total percentage growth rate of real per capita value added between 1986 and 2003 | 94 | 35.2 | 14.1 | −0.5 | 79.7 |
| Cross-section | Short-term credit over value added in 1986 | 94 | 1.4 | 0.5 | 0.7 | 3.1 |
| Cross-section | Long-term credit over value added in 1986 | 94 | 2.6 | 0.9 | 1.0 | 5.7 |
| Panel | Average percentage growth rate of real per capita value added between 1986 and 2003 | 401 | 2.0 | 3.3 | −14.4 | 34.6 |
| Panel | Short-term credit over value added | 401 | 2.5 | 1.3 | 0.8 | 8.4 |
| Panel | Long-term credit over value added | 401 | 1.5 | 1.2 | 0.1 | 8.7 |

The financial indicators are measured in millions of lire over ten millions of lire. Percentage numbers for financial indicators can be obtained by multiplying the figures in the table by 10

instance PRIVATE CREDIT in Levine et al. (2000) varied from 4% in Zaire to 141% in Switzerland, which suggests that pooling underdeveloped and developed countries may not be thoroughly informative. The panel data also show considerable variability, though it is less marked than in cross-country studies.

Figure 1 provides geographical evidence on the percentage growth rate of real per capita value added in the Italian provinces between 1986 and 2003 (G), and on short and long-term credit over value added. It also shows the four macro-regions into which Italy is usually divided: the North-west, the North-east, the Centre and the South and Islands. Historically, the North-west has been the most developed part of the country, while the South and Islands has been the most backward one.7

7 Usai and Vannini (2005) provide a descriptive picture of the Italian banking system.


Fig. 1 Geographical evidence regarding the growth rate of per capita value added between 1986 and 2003 (G), the ratio of total short-term credit over value added in 1986 (CREDY), the ratio of long-term credit over value added in 1986 (LTCREDY), and the Italian macro-regions

Between 1986 and 2003 the North-east, the Centre and the South of Italy experienced a higher growth rate of real per capita value added than did the North-west. This is a sign of convergence within Italy, given the leading position of the North-west with respect to the country’s other macro-regions at the beginning of the observation period. Inspection of the financial indicators shows that while the ratio of short-term credit over value added was much higher in the northern part of the country, the same did not hold true for long-term credit over value added. It is evident that in 1986 the banking sector was transferring resources from the North to the South in order to boost the catching-up process by financing long-term projects.


This scenario changed drastically over the period analysed. Vaona (2006) shows that while short-term credit was mainly channelled to northern provinces in both 1986 and 2003, long-term credit was redirected from southern provinces to those in the North-east during the same period. From an economic point of view, this means that resources were diminishing in the backward part of the country, to the benefit of regions experiencing fast economic growth. From a methodological point of view, this highlights the need to consider panel data estimators in order to capture dynamic changes in financial indicators over the period under analysis.

4 Econometric methods

Let us first consider the cross-section estimates. Model (1) did not include important regressors used in the growth literature, such as the size of current public expenditure or an indicator of capital accumulation. In order to control for omitted variables, the data of the various NUTS3-provinces were grouped according to the NUTS2-region in which they are situated, and the dataset was used as if it were an unbalanced panel, since NUTS2-regions contain different numbers of NUTS3-provinces.8

This step is important, first, because cross-sectional studies of economic growth have been criticized for being unable to account, as panel studies can, for the unobservable level of technology (Islam 1995; Caselli et al. 1996; de la Fuente 2002). Although there are presumably major technological differences among NUTS2-regions, they are less likely to be a highly significant factor within those regions. Secondly, it makes it possible to deal with the problems highlighted by Driffil (2003) and Manning (2003).

The analysis presented relied on the Fixed Effects estimator.9 In order to check for endogeneity of financial development indicators, the two-stage least squares dummy variables estimator (2SLSDV) was adopted. We used as instruments the geographical dummies that did not appear to be correlated with future growth in the Fixed Effects regression and which passed, at the 5% level, an F-test on their correlation with the instrumented variables (Wooldridge 2001). Using as instruments the geographical dummies not correlated with future growth was important in order to extract the exogenous part of the finance-growth nexus, excluding the dummies of regions where credit flowed because of their good economic prospects. On the other hand, the regional dummies correlated with financial indicators but not with future growth may play a role similar to that of the indicators of legal origin in the cross-country literature. In fact, whilst the letter of the law is the same within a country, the manner, efficacy and efficiency with which it is applied may vary from region to region, especially in the presence of markedly different local practices in a country like Italy, which achieved national unity much later than many of the other European countries.

8 There were 21 groups (one for each of the Italian NUTS2 regions) which ranged from a minimum of one observation (Valle d’Aosta) to a maximum of nine observations (Tuscany and Sicily).

9 Following Baltagi (2003), Vaona (2006) computes not only the Fixed Effects but also five different Random Effects estimators: the Wallace and Hussain one, the Swamy and Arora one, the Henderson, Fuller and Battese one and two minimum norm quadratic unbiased estimators. Results are stable across different Random Effects estimators, signalling the absence of major misspecification errors. A Hausman test favours the Fixed Effects estimator over the Random Effects ones.
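The grouping of provinces into regions described above can be illustrated with a small numerical sketch. It is not the author's code: the data are simulated, the function names `within_estimator` and `lsdv_estimator` are hypothetical, and the group sizes are chosen only to mimic an unbalanced panel in which one group has a single observation. The sketch shows that demeaning within each region gives the same slope estimates as including one dummy per region.

```python
import numpy as np

def within_estimator(y, X, groups):
    """Fixed Effects (within) estimator: demean y and X inside each group, then run OLS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_dm, X_dm = y.copy(), X.copy()
    for g in np.unique(groups):
        mask = groups == g
        y_dm[mask] -= y[mask].mean()
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

def lsdv_estimator(y, X, groups):
    """Least squares dummy variables: OLS of y on X plus one dummy per group."""
    labels, codes = np.unique(groups, return_inverse=True)
    D = np.eye(len(labels))[codes]              # one column of dummies per group
    coef, *_ = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)
    return coef[: X.shape[1]]                   # keep only the slopes on X

# Simulated, purely illustrative data: six "regions" with unequal numbers of "provinces"
rng = np.random.default_rng(0)
sizes = [1, 3, 5, 9, 4, 8]
groups = np.repeat(np.arange(len(sizes)), sizes)
region_effect = rng.normal(size=len(sizes))[groups]
X = rng.normal(size=(groups.size, 2)) + region_effect[:, None]   # regressors correlated with the effects
y = X @ np.array([2.0, -1.0]) + 3.0 * region_effect + rng.normal(scale=0.1, size=groups.size)

beta_within = within_estimator(y, X, groups)
beta_lsdv = lsdv_estimator(y, X, groups)
```

A region with a single province contributes nothing to the within estimate, because its demeaned observation is identically zero.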


We tested for endogeneity of the financial indicators by means of a Durbin-Wu-Hausman test which compared the 2SLSDV estimator with the Fixed Effects one. In order to assess the validity of overidentifying restrictions, we also computed the test statistic given by the product of the number of observations and the R² of the regression of the residuals of the 2SLSDV estimator on the control variables and the instruments (Wooldridge 2001). Finally, in order to check for spatial correlation in the residuals, we followed Anselin (1988) and computed Moran’s I statistic for all the estimators except 2SLSDV. For 2SLSDV the key reference is Anselin and Kelejian (1997), given that instrumental variables estimators require a specific Moran’s I statistic.

Panel data estimators were also implemented in order to obtain further results able to meet the above-discussed criticisms of cross-sectional estimates. One of the estimators most frequently used in the growth literature is the System GMM estimator developed by Blundell and Bond (1998). The validity of this estimator hinges on the absence of second-order serial correlation in the residuals, which can be tested by means of the statistic proposed by Arellano and Bond (1991). It is customary to insert time dummies in the estimated model in order to avoid residuals with second-order serial correlation. To deal with the possible endogeneity of financial indicators, the System GMM estimator was also adopted when estimating the static panel model. We used the Windmeijer (2005) small sample correction for both the static and the dynamic model to obtain reliable standard errors, and we performed the estimation on the basis of Roodman (2005). When we tested for spatial correlation in the residuals of GMM estimators, we again drew on Anselin and Kelejian (1997).
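As an illustration of the spatial-correlation check, the following sketch computes the raw Moran's I of a residual vector. The weights matrix `W`, the two residual vectors and the function name `morans_i` are hypothetical. The sketch covers only the basic statistic for least squares residuals; it does not implement the standardisation needed for inference, nor the adjusted statistic of Anselin and Kelejian (1997) for instrumental variables estimators.

```python
import numpy as np

def morans_i(residuals, W):
    """Raw Moran's I for residuals e and spatial weights W: (n / S0) * (e'We) / (e'e)."""
    e = np.asarray(residuals, dtype=float)
    W = np.asarray(W, dtype=float)
    s0 = W.sum()                                  # sum of all spatial weights
    return (e.size / s0) * (e @ W @ e) / (e @ e)

# Hypothetical example: four units on a line, adjacent units are neighbours
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
alternating = np.array([1.0, -1.0, 1.0, -1.0])   # neighbours always differ in sign
clustered = np.array([1.0, 1.0, -1.0, -1.0])     # neighbours mostly share a sign

i_alternating = morans_i(alternating, W)          # negative spatial correlation
i_clustered = morans_i(clustered, W)              # positive spatial correlation
```

Residuals from a regression with an intercept already have zero mean, which is why the sketch does not demean them.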
Following Baltagi (2003) and Schiavo and Vaona (2007), for the static panel model we computed a Roy-Zellner test for poolability in order to check that excessive heterogeneity within the sample did not prevent us from obtaining stable coefficient estimates. The null hypothesis was that the coefficients of the financial indicators would be identical across different provinces, whereas the alternative was that different provinces had different coefficients. Because we had an unbalanced dataset, we estimated the variance-covariance matrix of the errors by relying on Davis (2001).

5 Estimation results

Table 3 sets out the cross-sectional results. Financial variables are positively and significantly correlated with future real growth. Their endogeneity is rejected when 2SLSDV and the Fixed Effects estimator are compared. Instruments pass the F-test for correlation with the instrumented variables at the 5% level for all the specifications, and over-identifying restrictions cannot be rejected. Finally, unlike the findings of Driffil (2003) and Manning (2003), the coefficients of the financial indicators remain positive and significant even when adopting a Fixed Effects estimator.10 Considering

10 In order to control for the possible effect of the economic specialization of provinces, we also inserted into the model first the ratio between value added in agriculture and in manufacturing and then the ratio between value added in agriculture and in the service sector. We used a Fixed Effects estimator, and the results were stable when compared with those in Table 3. The new variables did not prove to be significantly different from zero.


Table 3 The effect of financial development on real economic growth in cross-section models. Dependent variable: total real growth rate of per head value added between 1986 and 2003

| | Fixed Effects | 2SLSDV | Fixed Effects | 2SLSDV |
|---|---|---|---|---|
| Short term credit over value added in 1986 | 5.74* (4.00) | 8.71* (2.90) | – | – |
| Long term credit over value added in 1986 | – | – | 8.68* (3.46) | 13.17* (2.32) |
| Sum of exports and imports over value added in 1986 | 0.02 (0.02) | −0.01 (−0.01) | 0.30 (0.18) | 0.20 (0.12) |
| Students attending secondary school over resident population in 1986 | −3.06 (−1.26) | −3.52 (−1.39) | −3.93 (−1.56) | −4.26 (−1.64) |
| Value of finished public infrastructures over value added in 1986 | 0.11 (1.29) | 0.16 (1.63) | 0.08 (0.91) | 0.09 (1.02) |
| Real value added per head in 1986 | −45.34* (−7.41) | −48.41* (−7.08) | −37.86* (−5.86) | −35.71* (−5.09) |
| Crimes per head in 1986 | 2.04 (1.87) | 1.49 (1.23) | 0.98 (0.78) | −0.15 (−0.08) |
| Constant | 94.93* (5.75) | 93.94* (5.55) | 93.99* (5.36) | 88.76* (4.72) |
| Dummy Campania | −21.36* (−4.08) | −19.78* (−3.56) | −20.35* (−3.71) | −18.05* (−2.93) |
| Dummy Puglia | −31.44* (−5.25) | −30.07* (−4.81) | −30.32* (−4.84) | −27.98* (−4.05) |
| Dummy Sicilia | −14.45* (−3.00) | −13.83* (−2.78) | −13.08* (−2.57) | −11.17* (−1.99) |
| Dummy Trentino Alto-Adige | 20.21* (2.78) | 21.95* (2.89) | – | – |
| Dummy Emilia Romagna | – | – | 9.91* (2.47) | 10.83* (2.57) |
| R² | 0.56 | 0.54 | 0.54 | 0.52 |
| Moran’s I (a) | −0.46 | −1.18 | −0.33 | −0.71 |
| Durbin-Wu-Hausman test, p-value (b) | – | 0.99 | – | 0.99 |
| Instrumental variable F-test, p-value (c) | – | 0.02 | – | 0.03 |
| Test for overidentifying restrictions, p-value (d) | – | 0.20 | – | 0.19 |
| Observations | 94 | 94 | 94 | 94 |

Asterisks denote coefficients significant at the 5% level. t-Statistics are shown in parentheses. Instruments in the 2SLSDV regression in the second column include the dummies for the regions Basilicata, Calabria, Emilia Romagna, Lazio, Marche, Molise, Sardegna, Toscana, Umbria and Valle d’Aosta. Instruments in the 2SLSDV regression in the fourth column include the dummies for the regions Calabria, Friuli-Venezia Giulia, Lazio, Liguria, Lombardia, Marche, Piemonte, Toscana, Veneto
a the null is no spatial correlation
b the null is no endogeneity in the comparison between the Fixed Effects and the 2SLSDV estimators
c the null is that the instruments are not significantly correlated with the instrumented variables
d the null is that over-identifying restrictions are not rejected
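The over-identification statistic reported in Table 3 is described in the methods section as the number of observations times the R² from regressing the 2SLSDV residuals on the controls and the instruments. The sketch below illustrates that computation on simulated data. It is a minimal example under stated assumptions, not the author's implementation: the function names (`ols`, `two_sls`, `overid_statistic`), the single endogenous regressor `f`, the three continuous instruments and all numerical values are hypothetical.

```python
import numpy as np

def ols(y, X):
    """Return the OLS coefficients of y on X (y may have several columns)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_sls(y, X, Z):
    """2SLS: project X on the instrument matrix Z, then regress y on the fitted values."""
    X_hat = Z @ ols(X, Z)
    beta = ols(y, X_hat)
    return beta, y - X @ beta                    # residuals use the original X, not X_hat

def overid_statistic(resid, Z):
    """n * R^2 from regressing the 2SLS residuals on all exogenous variables in Z."""
    fitted = Z @ ols(resid, Z)
    ss_res = np.sum((resid - fitted) ** 2)
    ss_tot = np.sum((resid - resid.mean()) ** 2)
    return resid.size * (1.0 - ss_res / ss_tot)

# Simulated, purely illustrative data with valid instruments
rng = np.random.default_rng(1)
n = 94
instruments = rng.normal(size=(n, 3))
common = rng.normal(size=n)                      # shock shared by f and y: source of endogeneity
f = instruments @ np.array([1.0, 0.5, -0.5]) + common + rng.normal(size=n)
y = 1.0 + 2.0 * f + common + rng.normal(size=n)

const = np.ones((n, 1))
X = np.hstack([const, f[:, None]])               # constant plus the endogenous regressor
Z = np.hstack([const, instruments])              # constant plus the excluded instruments
beta, resid = two_sls(y, X, Z)
stat = overid_statistic(resid, Z)
df = Z.shape[1] - X.shape[1]                     # number of over-identifying restrictions
```

Under the null that the instruments are valid, the statistic is asymptotically chi-squared with `df` degrees of freedom, so a p-value could be obtained with, for example, `scipy.stats.chi2.sf(stat, df)`.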


both short and long-term credit over value added, the dummies for three southern regions (Campania, Puglia and Sicilia) appear to have negative and highly significant coefficients. Remarkably, Campania and Sicilia are two of the Italian regions with the highest levels of organised crime. In the Fixed Effects estimates, we dropped dummies not significantly different from zero for the sake of parsimony.

Confirmation of the finance-growth nexus is also forthcoming when the static and dynamic panel data estimates are considered (Table 4). In order to ensure that the possible endogeneity of financial indicators did not bias the results, we excluded their lags and the lags of their differences from the instrument sets. Only the lags of the levels and first differences of the other regressors were included. Specification tests supported the model and no serial correlation was detected. Consequently, we did not insert any time dummy for the sake of parsimony. Furthermore, Table 4 shows a Wald test of equality between the two estimators obtained using three and six-year averages respectively: the null of equality between the two estimators could not be rejected at the 5% level, which supports the view that different data frequencies do not affect the results. No evidence of spatial correlation was found. When we performed dynamic estimates, two regional dummies were significant at the 5% level, namely Puglia with a negative sign and Emilia Romagna with a positive sign, which mirrors the cross-sectional results.

Unlike in Schiavo and Vaona (2007), who analysed the cross-country dataset used in Levine et al. (2000), a Roy-Zellner test could not reject the null of poolability. This showed that cross-region estimates may display much more stability than cross-country ones.
With regard to the finance term structure, and with the exception of the estimates for the dynamic panel model, it was not enough to compare the coefficient of long-term credit over value added with that of short-term credit over value added, because they are not elasticities.

We first examine the cross-sectional results. To determine whether short-term or long-term credit had a greater impact on growth, we considered the provinces with the minimum value of long and short-term credit over value added in 1986 and computed by how much their growth rate would have increased if they had had the average value of the financial indicators analysed. The province with the lowest value of long-term credit in 1986 was Benevento. If it had had the average value of long-term credit over value added, the model presented in Table 3 would imply an overall faster growth of 1.3% over the period from 1986 to 2003. On the other hand, the province with the lowest value for short-term credit over value added in 1986 was Isernia: if it had had the average value of short-term credit over value added, the model presented in Table 3 would imply an overall faster growth of 7.8% over the period analysed.

Comparing the effect of short and long-term credit over value added in the static panel estimates led to the same conclusions. Moving the province with the smallest value of short-term credit over value added to its average sample value would increase the growth rate of per capita real value added from 2.5 to 9.9% over a 3-year period. Performing the same exercise with long-term credit over value added, the economic growth rate would change from 6.9 to 10.1%. The coefficient estimates in the dynamic panel specification are close to one another, but the point estimate of the coefficient of short-term credit over value added is still greater than that of long-term credit over value added.
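The counterfactual exercise above can be written compactly. The display below is a sketch that assumes the static specifications are linear in the financial indicator. The symbols $g$ (growth rate), $F$ (credit over value added), $\hat\beta_F$ (its estimated coefficient), $\bar F$ (sample mean) and $F_{\min}$ (sample minimum) are introduced here for illustration and are not the author's notation.

$$
\Delta \hat g = \hat\beta_F \,(\bar F - F_{\min}), \qquad \hat\epsilon_{g,F} = \hat\beta_F \,\frac{F}{g}
$$

The first expression is the implied gain in growth from moving the province with the lowest indicator to the sample mean. The second is the elasticity implied by a linear coefficient: it depends on the point $(F, g)$ at which it is evaluated, which is why the raw coefficients of the two credit measures cannot be compared directly. In the dynamic model (2), both the dependent variable and the financial indicators enter in logs, so the coefficients can be read directly as short-run elasticities. The percentages quoted in the text depend on the units used in Tables 2 and 3 and are not recomputed here.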


Table 4 The effect of financial development on real economic growth—static and dynamic panel estimates Static Panel

Dynamic Panel

Long term credit over value added t-statistics

–

1.92∗

–

(2.58)

Short term credit over value added t-statistics

5.30∗

Real value added per head at the beginning of the 3-year-period t-statistics Students attending secondary school over resident population t-statistics Sum of exports and imports over value added t-statistics Crimes per head

–

Log (Long term credit over value added) t-statistics Log (Short term credit over value added) t-statistics

–

Log(real per head value added)t−1

(−3.62)

(−4.75)

t-statistics

(17.49)

(11.41)

Students attending secondary school over resident population t-statistics

−0.0001

−0.0001

−1.03 (−0.58)

0.66

(0.37)

−2.11

−0.98

(−1.63)

(−1.24)

Sum of exports and imports over value added t-statistics

(0.84)

(0.86)

t-statistics

Constant

15.41

16.56

Dummy Puglia

t-statistics

(1.48)

(1.68)

t-statistics

0.04

0.06

Dummy Emilia Romagna

0.08

0.12

t-statistics

0.11

0.16

0.07

0.08

0.93

0.84

Test for first order serial correlation ( p-value)a Test for second order serial correlation ( p-value)b Test for overident. restrictions ( p-value)c Moran’s I ( p-value)d

0.99 73

0.99 73

Crimes per head

Number of provinces

–

– 0.0481∗

−11.56∗

t-statistics

Number of provinces

–

(2.03)

0.26

Frequency Wald test ( p-value)e Roy-Zellner test ( p-value) f

(3.33)

–

−5.98∗

0.24

Test for first order serial correlation ( p-value)a Test for second order serial correlation ( p-value)b Test for overident. restrictions ( p-value)c MORAN’S I ( p-value)d

0.0474∗

0.7745∗

(−0.47) −0.0004 (−0.37) 0.0006 (1.16) −0.1996∗ (−2.10) 0.1174∗

(2.18) 0.7365∗

(−1.68) −0.0024 (−1.78) 0.0003 (0.75) −0.1429∗ (−2.75) 0.1036∗

(2.33)

(2.02)

0.03

0.03

0.29

0.31

0.33

0.35

0.09 72

0.11 72

Number of instruments

46

46

Number of instruments

72

72

Number of observations

401

401

Number of observations

330

330

Dependent variable. Static Panel: real growth rate of per head value added (three year averages). Dynamic Panel: log of real per head value added
Method: System-GMM
For the Static Panel estimates, the instrument set comprises the past lags of the levels of real value added per head, crimes per head and sum of imports and exports over value added; for the Dynamic Panel estimates, the instruments are past first differences and past levels of Log(real value added per head)t−1, students attending secondary school over resident population, exports and imports over value added, crimes per head.
Asterisks denote coefficients significant at the 5% level. t-Statistics are shown in parentheses
a the null is absence of first order serial correlation in the differenced residuals. Presence of first order serial correlation in the differenced residuals does not affect the validity of estimates
b the null is absence of second order serial correlation in the differenced residuals
c the null is that over-identifying restrictions are not rejected
d the null is no spatial correlation
e the null is equality between the estimators using three and six year averages
f the null is that the coefficient of the financial indicators is the same across different provinces


The greater impact of short-term credit on growth is hardly surprising, given that in Italy long-term credit is mainly granted to large firms. By contrast, small firms, which have driven the country’s economic development over the past two decades, have had to rely on the renewal of short-term credit, and therefore on good relationships with their banks. The abundance of short-term credit in a given province may therefore signal not only a greater availability of capital, but also a better relationship between banks and firms, which entails lower monitoring costs and a better functioning credit market.

6 Concluding remarks

This study has used a regional dataset to test the hypothesis that the level of financial development, defined as the size of the banking sector, spurs economic growth. The first advantage of this approach is that it does not require the pooling of developed and developing countries, which have very different features. Secondly, the approach makes it possible to check whether the finance-growth nexus holds even in a highly integrated market like that of a 147-year-old economic union, and to test whether long-term credit has a greater impact on growth than short-term credit. Finally, the measures of financial development adopted here are directly comparable to those of cross-country studies, so that their recent methodological advances can be incorporated into the cross-region literature.

The results obtained on the size of the banking sector shed new light on the impact of the financial sector’s functions on economic growth. Levine (2004) points out that the functions of financial systems are to: “produce information ex ante about possible investments and allocate capital; monitor investments and exert corporate governance after providing finance; facilitate the trading, diversification and management of risk; mobilize and pool savings; ease the exchange of goods and services”.

The evidence provided by this contribution does not confirm the growth impact of either the monitoring role of banks or their risk management function, or their ability to produce information on investment opportunities. However, the size of the banking sector relative to the size of the economy is an indicator of its ability to allocate capital, to mobilize and pool savings, and to ease the exchange of goods and services. The evidence of this paper supports the claim that the more a financial system is able to provide these functions, the more the economy will benefit in terms of enhanced growth.

Endogeneity of the financial development indicators has been rejected, and the omission of relevant variables (unobserved spatial heterogeneity) has not had a major effect on the coefficient estimates. Spatial correlation in the residuals does not appear to affect the results obtained here. Unlike in cross-country studies, the estimates appear to be robust to underlying coefficient heterogeneity, because econometric tests did not reject the hypothesis of poolability across different geographic units.

Acknowledgments The author would like to thank for their comments Paola Dongili, Angelo Zago, Francesco Aiello, Antonio Accetturo, Söhnke Bertram, Badi Baltagi, an anonymous referee and the participants at the seminars at the International Workshop on Spatial Econometrics and Statistics in Rome (25–27 May 2006), at IfW in Kiel (13 September 2006) and at the OFCE in Nice (24 November 2006). Adrian Belton and Kathy Lingo have been very helpful in editing the text. The usual disclaimer applies. Financial support from the Italian Ministry for University and Scientific Research (COFIN 2004, protocol number 2004-132703) is gratefully acknowledged.

References

Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht
Anselin L, Kelejian HK (1997) Testing for spatial error autocorrelation in the presence of endogenous regressors. Int Reg Sci Rev 20:153–182
Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and an application to an employment equation. Rev Econ Stud 58:277–297
Baltagi BH (2003) Econometric analysis of panel data. Wiley, New York
Becattini G, Pyke F, Sengenberger W (eds) (1992) Industrial districts and inter-firm co-operation in Italy. International Institute for Labour Studies, Geneva
Beck T, Levine R (2004) Stock markets, banks and growth: panel evidence. J Bank Finance 28:423–442
Beck T, Levine R, Loayza N (2000) Finance and the sources of growth. J Financial Econ 58:261–300
Blundell R, Bond S (1998) Initial conditions and moment restrictions in dynamic panel data models. J Econometr 87:115–143
Caprio G Jr, Demirgüç-Kunt A (1997) The role of long-term finance. Policy Research Working Paper 1746, The World Bank
Caselli F, Esquivel G, Lefort F (1996) Reopening the convergence debate: a new look at cross-country growth empirics. J Econ Growth 1:363–389
Davis P (2001) Estimating multi-way error components models with unbalanced data structures using instrumental variables. J Econometr 106:67–95
de la Fuente A (2002) On the sources of economic convergence: a close look at the Spanish regions. Euro Econ Rev 46:569–599
Diamond DW (1991) Financial intermediation and delegated monitoring. Rev Econ Studies 51:393–414
Driffil J (2003) Growth and finance. Manchester Sch 71:363–380
Goldsmith RW (1969) Financial structure and development. Yale University Press, New Haven
Guiso L, Sapienza P, Zingales L (2004a) Does local financial development matter? Quar J Econ 119:929–969
Guiso L, Sapienza P, Zingales L (2004b) The role of social capital in financial development. Am Econ Rev 94:526–556
Guiso L, Sapienza P, Zingales L (2006) The cost of banking regulation. NBER Working Paper 12501
Islam N (1995) Growth empirics: a panel data approach. Quar J Econ 110:1127–1170
Kahn S, Senhadji A (2001) Threshold effects in the relationship between inflation and growth. IMF Staff Pap 48:1–21
King RG, Levine R (1993a) Finance and growth: Schumpeter might be right. Quar J Econ 108:717–737
King RG, Levine R (1993b) Finance, entrepreneurship, and growth. J Monetary Econ 32:513–542
King RG, Levine R (1993c) Financial intermediation and economic development. In: Mayer C, Vives X (eds) Financial intermediation in the construction of Europe. CEPR, London, pp 156–189
La Porta R, Lopez-de-Silanes F, Shleifer A, Vishny RW (1998) Law and finance. J Polit Econ 106:1113–1155
Levine R (1998) The legal environment, banks, and long-run economic growth. J Money Credit Bank 30:596–613
Levine R (1999) Law, finance, and economic growth. J Financ Intermed 8:36–67
Levine R (2004) Finance and growth: theory and evidence. NBER Working Paper 10766
Levine R, Renelt D (1992) A sensitivity analysis of cross-country growth regressions. Am Econ Rev 82:942–963
Levine R, Zervos S (1998) Stock markets, banks and economic growth. Am Econ Rev 88:537–558
Levine R, Loayza N, Beck T (2000) Financial intermediation and growth: causality and causes. J Monetary Econ 46:31–77
Loayza N, Ranciere R (2004) Financial fragility, financial development, and growth. Policy Research Working Paper Series 3431, The World Bank
Manning MJ (2003) Finance causes growth: can we be so sure? Contributions to Macroeconomics 3
Observatory of European SMEs (2003a) Highlights from the 2003 Observatory. European Communities, Office for Official Publications of the European Communities, Luxembourg
Observatory of European SMEs (2003b) SMEs in Europe. European Communities, Office for Official Publications of the European Communities, Luxembourg
Rajan R, Zingales L (1998) Financial dependence and growth. Am Econ Rev 88:559–586
Roodman D (2005) xtabond2: Stata module to extend xtabond dynamic panel data estimator. Center for Global Development, Washington, http://econpapers.repec.org/software/bocbocode/s435901.htm
Schiavo S, Vaona A (2007) Poolability and the finance-growth nexus: a cautionary note. Econ Lett (in press)
Usai S, Vannini M (2005) Banking structure and regional economic growth: lessons from Italy. Ann Reg Sci 39:691–714
Vaona A (2006) Regional evidence on financial development, finance term structure and growth. Kiel Institute for the World Economy, Kiel Working Paper 1285
Vaona A, Schiavo S (2007) Nonparametric and semiparametric evidence on the long-run effects of inflation on growth. Econ Lett 94:452–458
Windmeijer F (2005) A finite sample correction for the variance of linear efficient two-step GMM estimators. J Econometr 126:25–51
Wooldridge JM (2001) Econometric analysis of cross section and panel data. The MIT Press, Cambridge

Convergence in per-capita GDP across European regions: a reappraisal

Valentina Meliciani · Franco Peracchi

Abstract This paper studies convergence in per-capita GDP across European regions over the period 1980–2000. We use median unbiased estimators of the rate of convergence to the steady-state growth path, while allowing for unrestricted patterns of heterogeneity and spatial correlation across regions. By permitting the model parameters to be completely different across regions, we not only avoid imposing strong a priori assumptions but are also able to analyze the spatial patterns in the estimated coefficients. Our results differ from those found using conventional estimators. The main differences are: i) the mean rate of convergence is much lower; ii) for most regions this rate is zero; iii) the number of regions for which we reject equality in trend growth rates is substantially lower. We also find significant evidence of correlation of growth rates across neighbor regions and across regions belonging to the same country.

Keywords Regional convergence · Median unbiased estimation · Heterogeneous panel models

JEL Classification C23 · O40 · O52 · R11

1 Introduction

This paper studies convergence in per-capita GDP across European regions over the period 1980–2000. The evidence currently available on regional convergence in Europe is mostly based on either cross-sectional “Barro regressions” or fixed-effects estimates. The results obtained vary considerably depending on the regions included, the sample period and the estimation method.

V. Meliciani
University of Teramo, Teramo, Italy

F. Peracchi (*)
Faculty of Economics, Tor Vergata University, I-00133 Rome, Italy
E-mail: [email protected]


Using cross-sectional “Barro regressions”, Barro and Sala-i-Martin (1991) found that regions within the European Union (EU) experienced convergent growth in per-capita GDP over the period 1950–1985 at an annual rate of about 2%. Their analysis, however, is confined to the richest European countries. Extending the analysis to 1990 and including the Spanish regions, Sala-i-Martin (1996) still finds significant convergence (although at the lower rate of 1.5%) in a regression that contains country dummies. Armstrong (1995) enlarges the sample to include Greece, Ireland, Luxembourg and Portugal, and finds that the rate of convergence between 1970 and 1990 was only about 1% per year. He concludes that rates of convergence, in particular within-country convergence, fell from their peak in the 1960s. Neven and Gouyette (1995) also find large differences in the patterns of convergence across subperiods and across subsets of regions.

The fixed-effects approach, originally used by Islam (1995) to measure convergence across countries, has been applied to study regional convergence by, among others, Canova and Marcet (1995) for the European regions and de la Fuente (1996) for the Spanish regions. All these studies obtain much higher convergence rates than those found in cross-country regressions. The convergence process has a different interpretation, however, for it is convergence to country- or region-specific steady states. Moreover, the high estimated convergence rates are difficult to reconcile with neoclassical growth theory, for they imply very low (and sometimes negative) capital shares. Canova and Marcet (1995), using a Bayesian estimator which permits the estimation of different convergence rates to different steady states for each region, find evidence supporting lack of convergence in income levels but some convergence in growth rates. De la Fuente (1998) finds that explicitly allowing for short-term noise reduces the estimated rate of convergence to values which are roughly consistent with an extended neoclassical model.

Both cross-sectional “Barro regressions” and fixed-effects estimates place strong a priori restrictions on the model parameters. The former impose complete regional homogeneity in the parameters of the process that describes the evolution of per-capita GDP, while the latter allow for unobserved heterogeneity but confine differences across regions to the intercept of the model.

An alternative time-series approach to convergence has been developed by Bernard and Durlauf (1995, 1996). According to this approach, a group of countries converge in output when the long-term forecasts of output for all countries are equal at a fixed time $t$, while countries have common trends in output if the long-term forecasts of output are proportional at a fixed time $t$. These definitions have natural testable counterparts in the cointegration literature. In fact, convergence requires countries’ outputs to be cointegrated with cointegrating vector $[1, -1]$, while the existence of common trends only requires the output series to be cointegrated with cointegrating vector $[1, -\alpha]$. This approach does not impose the constraints imposed by the cross-country and fixed-effects approaches. However, it requires long time series and does not allow estimating the different parameters of


the process that drives the evolution of per-capita GDP, such as the convergence rate and the trend growth rate.¹

Unlike previous studies at the regional level, this paper estimates separate processes for each region using the heterogeneous panel approach proposed by Lee et al. (1997) for studying convergence in a panel of countries over the period 1960–1989. By permitting the model parameters to differ completely across regions, we not only avoid imposing strong a priori assumptions but are also able to analyze the spatial patterns in the estimated coefficients. We also try to address some problems of this estimation method that were recognized but not addressed by Lee, Pesaran and Smith.

First, conventional estimators of the autoregressive coefficient, which captures the rate of convergence to the steady-state growth path, are severely downward biased in short time series. Further, this bias translates into invalid inference about the other model parameters. To deal with these problems, we use median unbiased estimators of the autoregressive parameter, as proposed by Andrews (1993), and construct confidence sets for the other parameters based on these median unbiased estimates.

Second, most panel studies of convergence ignore cross-sectional correlation in the regression errors. This is particularly implausible when studying convergence across regions, as contemporaneous shocks are likely to affect simultaneously different regions within the same country, and possibly also across countries. In this paper, we take into account the possibility of cross-sectional correlation by treating regional relationships as a system of seemingly unrelated regression equations.

The remainder of this paper is organized as follows. Section 2 presents the basic statistical model and its economic interpretation.
Section 3 discusses the issues that arise when trying to allow for complete regional heterogeneity in the model parameters, and describes how they are addressed. Section 4 presents the data used in the empirical analysis. Section 5 reports the results obtained. Finally, Section 6 offers some concluding remarks.

¹ For completeness, another approach to studying convergence in per-capita GDP is to focus on the evolution of its cross-sectional distribution. Using this methodology, Quah (1996) finds that while disparities have decreased between European countries, they have increased across regions within countries.

2 The statistical model

The basic statistical model in the empirical literature on convergence is the deterministic linear trend model with AR(1) errors

$$Y_{it} = c_i + g_i t + U_{it}, \qquad U_{it} = \lambda_i U_{i,t-1} + \varepsilon_{it}, \tag{1}$$

where $Y_{it}$ is the log of per-capita GDP of region $i$ at time $t$, $\lambda_i \in (-1, 1]$, and $\varepsilon_{it}$ is an innovation with constant variance $\sigma_i^2$. Notice that innovations may be contemporaneously correlated across regions. The parameters $c_i$ and $g_i$ respectively measure the mean initial level and the mean growth rate of per-capita GDP in region $i$, whereas the autoregressive parameter $\lambda_i$ measures the


degree of persistence of the shocks to log per-capita GDP in region $i$. The parameter $\nu_i = -\ln \lambda_i$, defined for $\lambda_i > 0$, measures the speed of convergence of per-capita GDP in region $i$ to its long-run growth path $c_i + g_i t$, and will be referred to as the “rate of convergence”.

The growth equations that are often estimated in cross-sectional studies (the so-called “Barro regressions”) can be obtained from Eq. (1) by imposing equality across regions in all parameters $(c_i, g_i, \lambda_i)$, while the growth equations estimated in the context of fixed-effects models can be obtained by imposing homogeneity in the parameters $g_i$ and $\lambda_i$, leaving the $c_i$ unrestricted. If $\lambda_i = 1$, the intercept $c_i$ is not identifiable and Eq. (1) reduces to $Y_{it} - Y_{i,t-1} = g_i + \varepsilon_{it}$, namely a random walk with drift $g_i$.

Equation (1) may arise as the reduced form of several growth models. Most empirical studies focus on the neoclassical growth model of Solow (1956) with no uncertainty, an aggregate Cobb-Douglas production function, initial level of technology $A_0$, capital share $\alpha$, depreciation rate of the capital stock $\delta$, savings rate $s$, growth rate of labor input $m$ and growth rate of technology $g$. Except for $A_0$, all the model parameters are assumed to be time invariant, although they may differ across regions (henceforth, we drop the subscript $i$ whenever this causes no ambiguity). In this model, the dynamic equation for log per-capita GDP is given by

$$Y_t = (1 - \lambda)(c + g t) + \lambda g + \lambda Y_{t-1}, \tag{2}$$

where $\lambda = e^{-\nu}$, $\nu = (m + g + \delta)(1 - \alpha)$ is the rate of convergence, and the parameter $c$ depends on all the model parameters through the relationship

$$c = \ln A_0 + \frac{\alpha}{1 - \alpha} \ln \frac{s}{m + g + \delta}.$$

Adding an innovation $\varepsilon_t$ to the deterministic relationship in Eq. (2) and rearranging terms gives a representation which is equivalent to Eq. (1).

More recently, Lee et al. (1997) have developed a stochastic version of the neoclassical growth model where both technology and employment follow AR(1) processes with a linear trend and possibly a unit root. In this model, countries might experience different growth rates even if they have access to the same technology. Equation (1) may be obtained as a reduced form of this model under somewhat stringent assumptions on the correlation between the employment and the technology shock, and the order of magnitude of their autocorrelation coefficients. In this case, the coefficient on the lagged dependent variable also depends on the amount of serial correlation in the technology shocks. In particular, a unit root in output may arise either because of constant marginal productivity of capital ($\alpha = 1$) or a unit root in technology.

3 Methodology

Unlike previous studies at the regional level, this paper estimates model (1) separately for each region, thus allowing for unrestricted parameter heterogeneity and arbitrary correlation in the innovations across regions. This enables us to


investigate the extent of convergence and the patterns of spatial correlation across European regions without imposing strong a priori homogeneity restrictions.

Estimation and inference about the parameters of model (1) are rather tricky. In carrying out the strategy of estimating the model parameters separately for each region, we need to address three issues: (i) the downward bias in the traditional estimates of the autoregressive parameter $\lambda$; (ii) the quality of the inference about the intercept $c$ and the slope $g$ of the time trend; and (iii) the likely correlation of the innovations across regions. As we argue below, the way in which the autoregressive parameter is estimated turns out to be crucial, for it affects inference (point estimation and hypothesis testing) about other parameters, even in the absence of any correlation of the innovations across regions.

3.1 Estimation of $\lambda$

The most common estimators of $\lambda$ are the coefficient on $Y_{t-1}$ in an OLS regression of $Y_t$ on a constant, a linear trend and $Y_{t-1}$, and various estimators obtained from the residuals $\hat U_t$ in an OLS regression of $Y_t$ on a constant and a linear trend, such as the ratio $\sum_{t=2}^{T} \hat U_t \hat U_{t-1} / \sum_{t=3}^{T} \hat U_{t-1}^2$ (the unconditional LS estimator), the ratio $\sum_{t=2}^{T} \hat U_t \hat U_{t-1} / \sum_{t=2}^{T} \hat U_{t-1}^2$ (the conditional LS estimator) and the coefficient of sample correlation between $\hat U_t$ and $\hat U_{t-1}$. Notice that only the last estimator guarantees that the estimates of $\lambda$ will lie within the parameter space $(-1, 1]$. Although consistent, all these estimators are known to be downward biased in finite samples, and the size of their bias increases with the absolute value of $\lambda$ and decreases with the sample size $T$. Not allowing for this bias represents one of the main flaws of existing studies on convergence.

Several ways of correcting conventional estimators of $\lambda$ for their bias have been proposed in the literature (see for example Quenouille 1956 and Orcutt and Winokur 1969). In this paper, we follow the procedure suggested by Andrews (1993), which corrects for median bias. We then use the resulting median unbiased estimates of $\lambda$ to carry out inference about the parameters of the time trend.

An estimator of $\lambda$ is said to be median unbiased if, for any $\lambda$, its sampling median is equal to $\lambda$. A median unbiased estimator has the “impartiality” property that the probabilities of overestimating and underestimating the true parameter $\lambda$ are the same. Andrews (1993) presents a method for constructing median unbiased estimators of $\lambda$ in Gaussian AR(1) models. His method may be used to bias-correct any estimator of $\lambda$ with a continuous and strictly increasing distribution function and a sampling median that is continuous and strictly increasing in $\lambda$ for $-1 < \lambda \le 1$. Notice that the parameter space includes the case of a unit root process and therefore allows for a smooth transition between the trend stationary case ($|\lambda| < 1$)


and the unit root case ($\lambda = 1$).² Given an estimator $\hat\lambda$ with median function $\zeta(\cdot)$, a median unbiased estimator of $\lambda$ is

$$\tilde\lambda = \begin{cases} 1, & \text{if } \hat\lambda > \zeta(1), \\ \zeta^{-1}(\hat\lambda), & \text{if } \zeta(-1) < \hat\lambda \le \zeta(1), \\ -1, & \text{otherwise,} \end{cases}$$

where $\zeta^{-1}(\cdot)$ is the inverse of $\zeta(\cdot)$ and $\zeta(-1) = \lim_{\lambda \to -1} \zeta(\lambda)$. Notice that, by construction, $\tilde\lambda$ belongs to the interval $[-1, 1]$. To see why $\tilde\lambda$ is median unbiased notice that, by definition, its median is equal to the median of $\zeta^{-1}(\hat\lambda)$. If $\zeta^{-1}$ is continuous and strictly increasing on $(-1, 1]$, it then follows that the median of $\tilde\lambda$ is equal to $\zeta^{-1}$ evaluated at the median of $\hat\lambda$, that is, $\zeta^{-1}(\zeta(\lambda)) = \lambda$. Implementation of this method typically relies on numerical evaluation of the median $\zeta(\lambda)$ of $\hat\lambda$ on a fine grid of $\lambda$ values, and interpolation to obtain the median function $\zeta(\cdot)$ and its inverse $\zeta^{-1}(\cdot)$.

Lee et al. (1997) point out that the main drawback of median unbiased estimators of $\lambda$ is their large sampling variance relative to conventional estimators. In the remainder of this section we investigate whether this larger sampling variance is more than offset by the smaller bias. We report summary statistics based on a set of Monte Carlo experiments for a sample of 21 observations from model (1) with Gaussian innovations. Each experiment consists of 10,000 replications and corresponds to a different value of $\lambda$ in the range $[-0.98, 1.00]$, at intervals of width .02. The same set of pseudorandom numbers is used in each experiment. The conventional estimator of $\lambda$ is again the coefficient on $Y_{t-1}$ in an OLS regression of $Y_t$ on a constant, a linear trend and $Y_{t-1}$.

We exploit two important properties of the model, namely the fact that when $|\lambda| < 1$ and the initial value $Y_0$ is random, the sampling distribution of the conventional estimator depends only on $\lambda$ and the sample size $T$, while when $\lambda = 1$ it does not depend on the initial value $Y_0$ (see Andrews 1993 for a proof). Thus, we set $c = g = 0$. For $|\lambda| < 1$, we randomly draw the innovations from the $N(0, 1)$ distribution and the starting value $Y_0$ from the $N(0, (1 - \lambda^2)^{-1})$ distribution, whereas for $\lambda = 1$ we set $Y_0 = 0$.
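As a rough illustration of the procedure just described, the following Python sketch approximates the median function $\zeta(\cdot)$ of the conventional estimator by simulation on a grid of $\lambda$ values and inverts it by linear interpolation. It is only a sketch under stated assumptions, not the authors' code: the function names, the seed, the reduced number of replications (2,000 rather than 10,000), the reading of "21 observations" as including the initial value $Y_0$, the monotonicity safeguard, and the use of the smallest grid point in place of the limit $\zeta(-1)$ are all choices made here for illustration.

```python
import numpy as np

N_OBS = 21                           # series length (assumed to include the initial value Y0)
N_REP = 2000                         # replications per grid point (the text uses 10,000)
GRID = np.linspace(-0.98, 1.0, 100)  # lambda values at intervals of .02


def simulate(lam, eps):
    """Simulate model (1) with c = g = 0; one series per row of eps."""
    y = np.empty_like(eps)
    # Stationary starting value when |lam| < 1, Y0 = 0 in the unit root case.
    y[:, 0] = eps[:, 0] / np.sqrt(1.0 - lam**2) if abs(lam) < 1 else 0.0
    for t in range(1, eps.shape[1]):
        y[:, t] = lam * y[:, t - 1] + eps[:, t]
    return y


def conventional_lambda(y):
    """OLS coefficient on Y[t-1] when Y[t] is regressed on a constant,
    a linear trend and Y[t-1], computed row by row via partialling out."""
    n = y.shape[1] - 1
    Z = np.column_stack([np.ones(n), np.arange(n)])
    M = np.eye(n) - Z @ np.linalg.solve(Z.T @ Z, Z.T)  # removes constant and trend
    lag, cur = y[:, :-1] @ M, y[:, 1:] @ M
    return (lag * cur).sum(axis=1) / (lag * lag).sum(axis=1)


# The same pseudorandom draws are reused for every value of lambda.
rng = np.random.default_rng(0)
eps = rng.standard_normal((N_REP, N_OBS))
med = np.array([np.median(conventional_lambda(simulate(lam, eps))) for lam in GRID])
med = np.maximum.accumulate(med)     # safeguard against small Monte Carlo wiggles


def median_unbiased(lam_hat):
    """Invert the simulated median function by linear interpolation."""
    if lam_hat > med[-1]:
        return 1.0
    if lam_hat <= med[0]:            # med[0] stands in for the limit zeta(-1)
        return -1.0
    return float(np.interp(lam_hat, med, GRID))
```

Calling `median_unbiased(0.66)`, for instance, maps a conventional estimate of .66 into its bias-corrected counterpart for this sample size. Any conventional estimate above the simulated median at $\lambda = 1$ is mapped to exactly 1.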
Figure 1 compares the median bias, the mean bias, the standard error (SE), and the root mean square error (RMSE) of the sampling distribution of the two estimators of $\lambda$. The figure shows that the downward bias of the conventional estimator is very large. For example, its mean bias is equal to −.214 for $\lambda = .60$, −.277 for $\lambda = .80$, −.325 for $\lambda = .90$ and −.363 for $\lambda = .96$.³ Using the conventional estimator therefore leads to severe underestimation of the autoregressive coefficient and to severe overestimation of the rate of convergence. Notice that the sampling median of the

² The method has two limitations. First, it only applies to AR(1) processes. An approximately median unbiased estimator for the AR(p) model has been proposed by Andrews and Chen (1994). Second, it requires knowledge of the shape of the distribution of the innovations. Numerical results presented by Andrews (1993) show that procedures based on the normality assumption are robust to a variety of nonnormal distributions.
³ Detailed tables are available from the authors upon request.


conventional estimator is strictly increasing in $\lambda$, which is what is required for constructing median unbiased estimators.⁴

The small-sample bias of the conventional estimator represents a problem for any empirical study of convergence based on short time series. For example, the sample of OECD countries used by Lee et al. (1997) consists of 29 annual observations. In this case, when $\lambda = 1$, the sampling median of the conventional estimator of $\lambda$ can be shown to be equal to .678.⁵ Considering that the cross-country median of their estimates of $\lambda$ is .789 (see their Table 1, p. 370), for more than half of the countries the median unbiased estimator of $\lambda$ would be equal to 1, implying no convergence. This may explain why their estimates show fast convergence but are nevertheless unable to reject the null hypothesis of a unit root in output.

Although the median unbiased estimator always has larger standard error and smaller mean bias than the conventional estimator, the difference in the variability of the two estimators does not increase with $\lambda$, while the difference in the bias does. In fact, while the bias and the standard error of the conventional estimator are strictly increasing in $\lambda$, the standard error of the median unbiased estimator actually decreases for $\lambda > .58$. It turns out that, for values of the autoregressive parameter above .32, the larger variance of the median unbiased estimator relative to the conventional one is more than offset by its smaller bias. Thus, for values of $\lambda$ corresponding to those typically found in convergence studies, the median unbiased estimator has smaller root mean square error than the conventional one.⁶ The efficiency of the median unbiased estimator relative to the conventional one depends of course on the sample size, and is typically reversed in large samples.

3.2 Inference about the time trend

Several estimators are available for the parameters $(c, g)$ in model (1). The OLS estimator in a regression of $Y_t$ on a constant and the linear trend is unbiased but inefficient. Its inefficiency vanishes in large samples, however, because the columns $(1, 1, \ldots, 1)$ and $(0, 1, \ldots, T-1)$ of the design matrix are close to being linear combinations of two characteristic vectors of the covariance matrix of an AR(1) process.⁷ When $|\lambda| < 1$ is known, the best linear unbiased estimator of $(c, g)$ is the GLS estimator, obtained by applying OLS to the data transformed using the feasible GLS (Prais-Winsten) transformation. When $\lambda = 1$, the parameter $c$ is not identifiable and the GLS estimator of $g$ is just the sample average of the differences $Y_t - Y_{t-1}$. When $\lambda$ is unknown, a feasible GLS estimator, asymptotically equivalent to GLS, is easily obtained by “plugging-in” a consistent estimate of $\lambda$.

⁴ We have no formal proof that the quantiles of the conventional estimator are strictly increasing in $\lambda$, although numerical calculations for various sample sizes show this to be the case (Andrews 1993).
⁵ Tables are available from the authors upon request.
⁶ The same experiment carried out for other conventional estimators of $\lambda$ confirms these results. Moreover, all conventional procedures provide very similar results in terms of mean bias, median bias, standard errors and RMSE.
⁷ Chipman (1979) showed that the greatest lower bound for the efficiency of the OLS estimator of $g$ over the interval $0 \le \lambda < 1$ is equal to .7534, approached as $T \to \infty$ and $\lambda \to 1$.
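The GLS computation described in Section 3.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `prais_winsten_trend` is hypothetical, the trend is indexed $0, 1, \ldots, T-1$ as in the design matrix mentioned in the text, and $\lambda$ is taken as given (either known or a plug-in estimate in $(-1, 1]$).

```python
import numpy as np


def prais_winsten_trend(y, lam):
    """GLS estimates of (c, g) in model (1) for a given lam in (-1, 1].

    Hypothetical helper for illustration; lam may be the true value or
    a plug-in estimate (which gives a feasible GLS estimator).
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    if lam == 1.0:
        # Unit root: c is not identified and the GLS estimate of g is
        # the sample average of the first differences.
        return np.nan, float(np.diff(y).mean())
    X = np.column_stack([np.ones(T), np.arange(T)])
    w = np.sqrt(1.0 - lam**2)
    # Prais-Winsten transformation: rescale the first observation and
    # quasi-difference all the others.
    y_star = np.concatenate([[w * y[0]], y[1:] - lam * y[:-1]])
    X_star = np.vstack([w * X[0], X[1:] - lam * X[:-1]])
    coef, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return float(coef[0]), float(coef[1])
```

Unlike the Cochrane-Orcutt variant, this transformation keeps the first observation, rescaled by $\sqrt{1 - \lambda^2}$. On a noiseless series $Y_t = c + g t$ the function returns $c$ and $g$ up to rounding error for any admissible $\lambda < 1$.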

Fig. 1 Median bias, mean bias, standard error (SE) and root mean square error (RMSE) of conventional and median unbiased estimators of $\lambda$ (four panels, each plotted against the autoregressive parameter)

The approximate GLS estimator proposed by Cochrane and Orcutt (1949) is instead quite inefficient in finite samples, even when $\lambda$ is known, especially for $\lambda$ close to unity. The source of the inefficiency is the omission of the first observation. The problems with the Cochrane-Orcutt estimator worsen considerably when $\lambda$ is unknown.

The finite-sample properties of all these estimators have been investigated by Park and Mitchell (1980) and Canjels and Watson (1997). The two studies show that, when $\lambda$ is estimated in a conventional way, the Cochrane-Orcutt estimator is always less efficient than OLS, while feasible GLS estimators based on the Prais-Winsten transformation (either two-step or fully iterated) offer efficiency gains over OLS that range from modest to substantial depending on the value of $\lambda$ and the sample size. For large values of $\lambda$, feasible GLS estimators appear to have a slight edge in small samples over the exact maximum likelihood procedure based on the normality assumption. Because of these results, we henceforth focus on feasible GLS estimators of Eq. (1).

When a feasible GLS procedure is used, the way in which $\lambda$ is estimated is crucial. First, the feasible GLS transformation breaks down when the estimates of $\lambda$ are greater than one in absolute value. Second, biased estimation of $\lambda$ may reduce the efficiency gain from using a feasible GLS estimator. Third, and most importantly, it may imply probabilities of Type I error that are higher than the nominal ones. In fact, the Monte Carlo evidence in Park and Mitchell (1980) reveals large discrepancies between the actual and the nominal level of Wald tests on the trend coefficient when $\lambda$ is positive and conventional estimates of $\lambda$ are used. To see the source of the problem, notice that, under model (1), the sampling variance of the


exact GLS estimator $\hat g$ is equal to $\operatorname{var}(\hat g) = q_1 \sigma^2 / (q_1 q_3 - q_2^2)$, where $q_1$, $q_2$ and $q_3$ are the following functions of $\lambda$ and the sample size $T$:

$$q_1 = 1 - \lambda^2 + \sum_{t=1}^{T-1} (1 - \lambda)^2 = (1 - \lambda)\,[T(1 - \lambda) + 2\lambda],$$

$$q_2 = (1 - \lambda) \sum_{t=1}^{T-1} [t - \lambda(t - 1)] = (T - 1)(1 - \lambda)\left[\frac{T}{2}(1 - \lambda) + \lambda\right],$$

$$q_3 = \sum_{t=1}^{T-1} [t - \lambda(t - 1)]^2 = T(T - 1)(1 - \lambda)\left[\frac{2T - 1}{6}(1 - \lambda) + \lambda\right] + (T - 1)\lambda^2.$$

This sampling variance increases monotonically with $\lambda$ for $T$ fixed. Estimating $\operatorname{var}(\hat g)$ by “plugging-in” a downward biased estimator of $\lambda$ leads to underestimating the sampling variance of $\hat g$ and therefore to incorrectly rejecting a null hypothesis about $g$ with a probability that is larger than the nominal size of the test.

Figure 2 reports the results of a set of Monte Carlo experiments that analyze the actual level of a t test of significance of the linear trend in model (1) estimated by feasible GLS with alternative estimates of $\lambda$. The setup of the experiments is exactly the same as in Section 3.1. The figure compares the actual frequencies of Type I error for nominal 5%-level two-sided tests based on conventional and median unbiased estimators of $\lambda$. Except for values of $\lambda$ close to −1, the actual level of the test is always higher than the nominal level and the discrepancy between the

Fig. 2 Monte Carlo frequency of Type I error for a nominal 5%-level two-sided t test of significance of the linear trend in model (1) estimated by exact GLS using conventional and median unbiased estimators of $\lambda$ (horizontal axis: $\lambda$)

actual and the nominal level increases with $\lambda$. The frequency of Type I error is much larger, however, when the conventional estimator of $\lambda$ is used. For example, the test based on the conventional estimator rejects in 17.3% of the cases when $\lambda = .60$, in 27.1% when $\lambda = .80$, in 37.0% when $\lambda = .90$, and in 45.3% when $\lambda = .96$. By contrast, the test based on the median unbiased estimator rejects in 9.8% of the cases when $\lambda = .60$, in 13.1% when $\lambda = .80$, in 17.1% when $\lambda = .90$, and in 20.9% when $\lambda = .96$. The use of median unbiased estimators of $\lambda$ therefore goes a long way towards reducing the discrepancy between the actual and the nominal level of a test, thus providing a simple and viable alternative to the use of generalized bounds tests, as proposed by Dufour (1990), or asymptotically conservative tests, as proposed by Canjels and Watson (1997).

Our final concern is the possible correlation of the innovations across regions. It is hard to justify the assumption that innovations in two different regions are uncorrelated. In fact, correlation is likely to be present either between regions in the same country (because of common country-specific shocks) or between adjacent regions in different countries (because of trade and spillover effects). Thus, when testing for equality across regions of the parameters of the time trend, one should deal with the fact that the cross-sectional correlation in the innovations may lead to invalid inference if not properly taken into account. Lee et al. (1997) try to remove the contemporaneous correlation by transforming the data into deviations from the country-specific mean. However, their procedure is only justified when countries (regions) have the same value of the autoregressive parameter and when the common shocks have the same impact across all countries (regions).
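The size distortion discussed in this section can be traced back to the closed-form variance $\operatorname{var}(\hat g) = q_1 \sigma^2 / (q_1 q_3 - q_2^2)$. The Python sketch below evaluates that expression and cross-checks it against a direct computation from the Prais-Winsten transformed design matrix. The function names are hypothetical and the check is only an illustration for $|\lambda| < 1$ (at $\lambda = 1$ the ratio is of the form 0/0).

```python
import numpy as np


def var_g(lam, T, sigma2=1.0):
    """Closed-form sampling variance of the exact GLS estimator of g, |lam| < 1."""
    q1 = (1 - lam) * (T * (1 - lam) + 2 * lam)
    q2 = (T - 1) * (1 - lam) * (T / 2 * (1 - lam) + lam)
    q3 = (T * (T - 1) * (1 - lam) * ((2 * T - 1) / 6 * (1 - lam) + lam)
          + (T - 1) * lam**2)
    return sigma2 * q1 / (q1 * q3 - q2**2)


def var_g_direct(lam, T, sigma2=1.0):
    """Same quantity from the transformed design matrix (constant, trend 0..T-1)."""
    X = np.column_stack([np.ones(T), np.arange(T)])
    X_star = np.vstack([np.sqrt(1 - lam**2) * X[0], X[1:] - lam * X[:-1]])
    return sigma2 * np.linalg.inv(X_star.T @ X_star)[1, 1]


T = 21
for lam in (0.0, 0.6, 0.9):
    print(f"lambda = {lam:.2f}: var(g) = {var_g(lam, T):.5f}")

# Plugging in a downward biased value of lambda understates the variance.
# The value 0.575 is 0.90 minus the mean bias of .325 reported in Section 3.1.
ratio = var_g(0.575, T) / var_g(0.90, T)
print(f"variance ratio with biased plug-in: {ratio:.3f}")
```

Because the variance increases with $\lambda$, the ratio printed in the last line is below one, so a t statistic built on the biased plug-in value uses a standard error that is too small and rejects too often.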
In this paper, we follow an alternative route. First, we remove the autocorrelation by using the median unbiased estimates of the region-specific autocorrelation coefficient to transform the observations via the exact GLS transformation. We then test for equality of the time trend coefficients between pairs of regions by estimating a seemingly unrelated regression equations (SURE) model on the transformed data in order to take into account the possible correlation in the innovations.⁸

4 The data

Our data come from the REGIO database of Eurostat and are categorized according to the Nomenclature of Statistical Territorial Units (NUTS). Although this categorization consists of three levels (NUTS1, NUTS2 and NUTS3, with NUTS1 corresponding to the coarsest level and NUTS3 to the finest), none of them can be considered fully satisfactory (Boldrin and Canova 2000). For this reason, we rely instead on the alternative categorization proposed by Paci (1997) and Rodríguez-Pose (1998).

⁸ Phillips and Sul (2003) show that, in the case of short time series with high degrees of cross-sectional dependence, the SURE median unbiased estimator has MSE performance that is 5 times better than that of the OLS estimator and twice as good as that of the SURE estimator.


The selected categorization follows two criteria: (i) comparable levels of self-government in countries with a sufficient degree of administrative decentralization (Germany, Belgium, Spain, Italy, France, and partially Portugal and the UK) and (ii) comparable size in terms of territory or population for the remaining countries (Denmark, Greece, Ireland, Luxembourg, and the Netherlands). It selects regional units corresponding to the following administrative levels: Régions for Belgium, Régions for France, Länder for Germany, Groups of Development regions for Greece, Regioni for Italy, Landsdelen for the Netherlands, Regiões autónomas for Portugal, Comunidades autónomas for Spain, and Standard regions for the UK. The resulting categorization coincides with NUTS 1 for Belgium, Germany, Greece, the Netherlands and the UK, and with NUTS 2 for France, Italy, Portugal and Spain, while Denmark, Ireland and Luxembourg are each treated as a single region.

A further complication is the fact that, in late 1998, the NUTS was revised to incorporate changes in the administrative structure of the various countries. There were minor revisions for Finland, Germany and Sweden, but major revisions for the UK. To ensure comparability over time, whenever possible we reclassify the data for 1995–2000 according to the old NUTS. For Germany, we exclude the Eastern Länder and some other regions for which there is no correspondence between the old and the new NUTS. Moreover, we exclude Brussels and three UK regions (North, North-West and South-East) for which data were not comparable across the two classifications. The resulting sample consists of 95 regions followed for each year from 1980 to 2000 (see Table 1).⁹

GDP data have been converted to a common scale using purchasing power parities (PPPs) rather than exchange rates, since the latter do not take into account differences in purchasing power across countries. Growth rates are computed using per-capita GDP in 1995 PPPs and prices.
Due to the lack of regional price indices, data have been deflated using the national consumer price index.

⁹ The location of the regions in a geographical map is reported in Meliciani and Peracchi (2004).

5 Empirical results

We estimate model (1) separately for each of the 95 European regions using both conventional and median unbiased estimators of $\lambda$. After presenting the results obtained under different estimation procedures (Section 5.1), we discuss the evidence on spatial correlation (Section 5.2) and parameter heterogeneity (Section 5.3).

5.1 Parameter estimates

Table 2 reports summaries of the distribution of the estimates of the model parameters $c$, $g$ and $\lambda$ across regions. Within the neoclassical growth model, $c$ is the steady-state level of per-capita GDP in the absence of technical change, whereas $g$ is the rate of technical change. We also report summaries of the rate of convergence parameter $\nu = -\ln \lambda$.


Table 1 List of the European regions considered

Code  Region                   Code  Region                       Code  Region
be2   Vlaams Gewest            es62  Murcia                       it4   Emilia-Romagna
be3   Région Wallonne          es63  Ceuta y Melilla              it51  Toscana
dk    Denmark                  es7   Canarias                     it52  Umbria
de1   Baden-Württemberg        fr1   Ile de France                it53  Marche
de2   Bayern                   fr21  Champagne-Ardenne            it6   Lazio
de5   Bremen                   fr22  Picardie                     it71  Abruzzo
de6   Hamburg                  fr23  Haute-Normandie              it72  Molise
de7   Hessen                   fr24  Centre                       it8   Campania
de9   Niedersachsen            fr25  Basse-Normandie              it91  Puglia
dea   Nordrhein-Westfalen      fr26  Bourgogne                    it92  Basilicata
deb   Rheinland-Pfalz          fr3   Nord-Pas-de-Calais           it93  Calabria
dec   Saarland                 fr41  Lorraine                     ita   Sicilia
def   Schleswig-Holstein       fr42  Alsace                       itb   Sardegna
gr1   Voreia Ellada            fr43  Franche-Comté                lu    Luxembourg
gr2   Kentriki Ellada          fr51  Pays de la Loire             nl1   Noord-Nederland
gr3   Attiki                   fr52  Bretagne                     nl2   Oost-Nederland
gr4   Nisia Aigaiou, Kriti     fr53  Poitou-Charentes             nl3   West-Nederland
es11  Galicia                  fr61  Aquitaine                    nl4   Zuid-Nederland
es12  Principado de Asturias   fr62  Midi-Pyrenees                pt11  Norte
es13  Cantabria                fr63  Limousin                     pt12  Centro
es21  Pais Vasco               fr71  Rhône-Alpes                  pt13  Lisboa e Vale do Tejo
es22  Comunidad de Navarra     fr72  Auvergne                     pt14  Alentejo
es23  La Rioja                 fr81  Languedoc-Roussillon         pt15  Algarve
es24  Aragón                   fr82  Provence-Alpes-Côte d'Azur   uk2   Yorkshire and Humberside
es3   Comunidad de Madrid      ie    Ireland                      uk3   East Midlands
es41  Castilla y León          it11  Piemonte                     uk4   East Anglia
es42  Castilla-la Mancha       it12  Valle d’Aosta                uk6   South West
es43  Extremadura              it13  Liguria                      uk7   West Midlands
es51  Cataluña                 it2   Lombardia                    uk9   Wales
es52  Comunidad Valenciana     it31  Trentino-Alto Adige          uka   Scotland
es53  Baleares                 it32  Veneto                       ukb   Northern Ireland
es61  Andalucia                it33  Friuli-Venezia Giulia

The table shows the results obtained when the model is estimated under different assumptions on parameter heterogeneity. The cross-sectional estimates assume a common rate of convergence and a common steady-state level of per-capita GDP. Notice that only the parameter λ can be estimated in this case. Fixed-effects estimates allow for region-specific values of c but assume a common value of g and λ . Finally, heterogeneous panel models allow all three parameters to be regionspecific. In this case, we report both the conventional and the median unbiased estimates of the autoregressive parameter. For the other parameters (c and g) we report the GLS estimates based on these alternative estimates of λ: The rate of convergence ranges from a value of .016 for the cross-sectional case, to .13 for the fixed-effect estimates, to a mean value of .53 for the heterogeneous panel estimates based on the conventional estimates of λ: Our fixed-effects and heterogeneous panel estimates of the rate of convergence are much larger than those obtained by Lee et al. (1997) at the country level for a

Table 2 Summary of parameter estimates (the rate of convergence is computed as −ln λ)

                                c        g        λ    −ln λ
Cross-section regression                           .984     .016
Fixed-effects estimates
  Mean                      9.226     .027     .880     .127
  Standard deviation         .259
  Minimum                   8.687
  Lower quartile            9.075
  Median                    9.216
  Upper quartile            9.365
  Maximum                   9.966
Heterogeneous panel using λ̂ (conventional)
  Mean                      9.353     .022     .623     .527
  Standard deviation         .314     .009     .191     .450
  Minimum                   8.543    −.002    −.076     .021
  Lower quartile            9.150     .015     .533     .294
  Median                    9.374     .020     .663     .409
  Upper quartile            9.575     .025     .745     .625
  Maximum                  10.170     .054     .980    3.578
Heterogeneous panel using λ̃ (median unbiased)
  Mean                      9.261     .021     .879     .185
  Standard deviation         .284     .009     .201     .442
  Minimum                   8.556     .000     .034     .000
  Lower quartile            9.088     .015     .826     .000
  Median                    9.358     .020    1.000     .000
  Upper quartile            9.462     .024    1.000     .191
  Maximum                   9.618     .054    1.000    3.368

sample of 22 OECD countries. In fact, they obtain a value of .95 for λ (implying a rate of convergence of .05) when allowing for heterogeneity only in c, and a value of .76 for λ (implying a rate of convergence of .27) when allowing for complete heterogeneity (see their Tables 1 and 4). Our higher estimates could depend, in part, on the fact that our data refer to regions rather than countries. They may also reflect the fact that the downward bias in the autoregressive coefficients (and therefore the upward bias in the rate of convergence) is larger for shorter time series (see Andrews 1993). Since our time series consist of 21 observations while those of Lee, Pesaran and Smith consist of 29 observations, the upward bias in the rate of convergence should be larger for our estimates. In fact, the table shows that the mean rate of convergence falls from .53 to about .18 if we use median unbiased rather than conventional estimators of λ. Further, for more than half of the regions the median unbiased estimator of λ is equal to one, implying no convergence. Also note that the trend growth rate is higher for the fixed-effects estimates (0.027) than for the heterogeneous panel estimates (0.022 and 0.021 for the results based, respectively, on the conventional and on the median unbiased estimates of λ).
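The two numerical points in the comparison above can be illustrated with a small sketch. It assumes that the rate of convergence equals −ln λ, which is the relation implied by the values reported in Table 2, and it simulates a simplified AR(1) with an intercept only rather than the trend model used in the paper. The function names, the true value 0.9, the number of replications and the seed are illustrative assumptions, not taken from the paper.

```python
import math
import random


def convergence_rate(lam):
    """Rate of convergence implied by an AR(1) coefficient: -ln(lambda)."""
    return -math.log(lam)


def ols_ar1(y):
    """OLS slope from regressing y[t] on a constant and y[t-1]."""
    x, z = y[:-1], y[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxz / sxx


def mean_ols_estimate(lam, n_obs, reps=2000, seed=0):
    """Average OLS estimate of lam over simulated AR(1) series of length n_obs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Draw the first value from the stationary distribution, then iterate.
        y = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - lam ** 2)]
        for _ in range(n_obs - 1):
            y.append(lam * y[-1] + rng.gauss(0.0, 1.0))
        total += ols_ar1(y)
    return total / reps


# Values of lambda quoted in the text and in Table 2.
for lam in (0.984, 0.880, 0.95, 0.76):
    print(f"lambda = {lam:.3f} -> rate = {convergence_rate(lam):.3f}")

true_lam = 0.9  # illustrative value, not an estimate from the paper
for n_obs in (21, 29):
    print(n_obs, round(mean_ols_estimate(true_lam, n_obs), 3))
```

In this simplified setting the average conventional estimate lies below the true λ, and the gap is wider for 21 observations than for 29, which is the direction of the bias discussed above.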


Figure 3 is a map of Europe showing the estimates of the trend growth rate. Higher values of the estimates correspond to darker colors in the map. Looking at the map, there is evidence of both spatial and national effects in the distribution of the trend growth rate. The highest growth rates are found in all the Portuguese regions, several Spanish regions, Ireland, Luxembourg, the Greek Islands and two Italian regions (Trentino-Alto Adige, Veneto). The Spanish regions with the highest trend growth rate are Ceuta-y-Melilla, Canarias, Comunidad de Madrid, Extremadura, Cataluña, Aragón, Comunidad de Navarra, Balears, Murcia, Comunidad Valenciana, Castilla-la Mancha, Pais Vasco and Castilla-y-León. This group includes regions with per-capita incomes both above and below the national average. The UK regions appear to have intermediate trend growth rates, while the French regions tend to have below-average growth rates. In general, laggard countries, with the exception of Greece, appear to experience above-average mean growth rates. However, the same tendency does not appear to emerge across regions within the same country.10

Fig. 3 Estimates of the trend growth rate

10 The results for the trend growth rate estimated using conventional estimates of λ do not differ much from the ones reported in the map (based on median unbiased estimates of λ).


5.2 Spatial correlation

The visual impression of spatial correlation in the trend growth rate may be investigated more formally. A popular indicator of spatial correlation is the Moran coefficient, defined as

\[
I = \frac{S^{-1} \sum_{i=1,\, i \neq j}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{n^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2},
\]

where $x_i$ is the value of the variable under consideration in region i, $\bar{x}$ denotes the average value of the variable across all regions, n is the total number of regions, $w_{ij}$ denotes the generic element of an n × n matrix of weights, called the contiguity matrix, and $S = \sum_{i=1,\, i \neq j}^{n} \sum_{j=1}^{n} w_{ij}$. The Moran coefficient takes the classic form of any autocorrelation coefficient: the numerator measures the covariance among the $x_i$ and the denominator measures the variance.11 Because the Moran coefficient is asymptotically normally distributed under some regularity conditions (see Cliff and Ord 1973, Chapter 1), inference on the significance of spatial correlation may be based on the standardized values of I.12

The specification of the contiguity matrix is crucial for the Moran coefficient. We consider three different specifications. The first assigns a weight of one when two regions share the same border and a weight of zero otherwise. This matrix will be referred to as the “neighbor matrix”. To investigate to what extent spatial correlation might be due to country effects, we construct a “foreign neighbor matrix”, by considering only border regions and by assigning a weight of one when two regions belonging to two different countries share the same border and a weight of zero otherwise. We also consider a “country matrix” that assigns a weight of one when two regions belong to the same country and a weight of zero otherwise. These matrices are used to compute the amount of neighbor, foreign neighbor and country correlation in the trend growth rate.
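The definition of the Moran coefficient and of the “country matrix” can be sketched in a few lines of code. The sketch below is only an illustration: the function names are hypothetical, the data are made up, and it computes the coefficient itself without the standardization needed for inference.

```python
def morans_i(x, w):
    """Moran coefficient of the values x under the weight matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    # S: sum of all off-diagonal weights.
    s = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
    # Weighted cross-products of deviations from the mean.
    cross = sum(w[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n) if i != j)
    variance = sum(d * d for d in dev) / n
    return (cross / s) / variance


def country_matrix(countries):
    """Weight 1 for two distinct regions in the same country, 0 otherwise."""
    n = len(countries)
    return [[1 if i != j and countries[i] == countries[j] else 0
             for j in range(n)] for i in range(n)]


# Hypothetical toy data: six regions in two countries, made-up growth rates.
countries = ["A", "A", "A", "B", "B", "B"]
growth = [0.030, 0.028, 0.032, 0.015, 0.017, 0.013]
print(round(morans_i(growth, country_matrix(countries)), 3))
```

With these made-up numbers, growth rates are similar within each country and different across countries, so the coefficient computed with the country matrix is close to one. A neighbor or foreign neighbor matrix can be passed to the same function in place of the country matrix.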
The value of the Moran coefficient changes little across estimation methods.13 The trend growth rate is highly correlated for regions belonging to the same country (the correlation coefficient is 0.60 for both conventional and median unbiased estimators). It is also highly correlated for neighboring regions, but the correlation is lower across neighbors than across regions in the same country (for neighboring regions the correlation coefficient is 0.46 using conventional estimators and 0.44 using median unbiased estimators). Moreover, when we compute the Moran coefficient excluding regions belonging to the same country, the correlation is still positive but statistically insignificant. This indicates the presence of important country effects in regional trend growth rates. In the neoclassical growth model, the trend growth rate g represents the rate of growth of technology. Following this interpretation, it appears that, in spite of the further integration of European regions, the diffusion of technology remains faster within one country than across borders.

Overall, the results show that, while there is little evidence of convergence to each region’s steady-state per-capita GDP,14 there is some evidence of catching-up, since the estimates of the trend growth rate of most regions in some laggard countries (Spain and Portugal) are higher than the average. On the other hand, the fact that trend growth rates are similar for regions within the same country, independently of their initial levels of per-capita GDP, is consistent with the lack of within-country convergence in levels found by many studies on regional growth in Europe (see e.g. Boldrin and Canova 2000).

11 Rather than imposing any a priori constraint on spatial correlation in the coefficients or the error term of the model, we prefer to allow for complete heterogeneity in the coefficients and arbitrary patterns of correlation in the residuals and to use the Moran coefficient as a descriptive tool that summarizes the spatial distribution of the estimated coefficients.
12 For the form of the asymptotic mean and standard deviation of I, see Cliff and Ord 1973.
13 We concentrate on the trend growth rate because for more than half of the regions the estimates of the autoregressive parameter are equal to one and the intercepts are not defined.

5.3 Testing for parameter heterogeneity

As already discussed, using conventional estimates of the autoregressive coefficient could lead to rejecting the null hypothesis more frequently than the nominal size of the test. Here we compare the results obtained using GLS estimators based on alternative estimates of the autoregressive parameter λ. In either case, we compare the results obtained with and without taking into account the contemporaneous cross-sectional correlation in the innovations. The number of pairwise tests of equality of the trend slope g is equal to $n(n-1)/2 = 95(94)/2 = 4{,}465$. For the intercept c, the number of pairwise tests of equality depends instead on the number of regions for which the estimated value of λ is less than one in absolute value, as the parameter is only identified in this case. Since this number is rather small, we focus on tests of homogeneity in trend growth rates. The amount of heterogeneity in the estimated trend growth rates is significantly reduced when the GLS transformation is carried out using the median unbiased estimates rather than the conventional ones.
This is true independently of whether or not we also allow for contemporaneous correlation in the innovations. Ignoring the contemporaneous correlation (GLS) and using conventional estimates of λ, equality in g is rejected at the 5% level in 54.5% of the cases, and at the 10% level in 62.0% of the cases. Using median unbiased estimates of λ, equality in g is instead rejected at the 5% level in only 18.5% of the cases and at the 10% level in only 25.4% of the cases. When taking into account the contemporaneous cross-section correlation in the innovations (SURE), rejection rates at the 5% (10%) level go up to 64.4% (70.7%) if conventional estimates of λ are used, and to 25.9% (34.5%) if median unbiased estimates are used. Figure 4 reports in more detail the results of our pairwise tests of equality of the trend growth rate based on SURE estimates that allow for contemporaneous correlation in the innovations across regions.15 The top and bottom panels correspond, respectively, to the conventional and median unbiased estimates of λ. Each point of a panel represents a pair of regions. The symbol “x” indicates

14 This result can be interpreted as evidence against decreasing marginal productivity of capital within the Solow growth model, but is also consistent with a unit root in technology in the stochastic version of the model.
15 A similar figure obtained ignoring cross-sectional correlation in the innovations is reported in Meliciani and Peracchi (2004).


rates. Due to the symmetry of each matrix, we have drawn the results of the tests only for the part below the diagonal. Again we can observe that there are important country effects. In particular, on the basis of conventional t-tests, most growth homogeneity is found across regions belonging to the same country. On the other hand, the trend growth rate of most French regions appears to be significantly different from the trend growth rate of most Spanish and Portuguese regions and from that of Luxembourg and Ireland. On the basis of corrected t-tests, growth heterogeneity occurs in few cases (mostly involving Ireland, Luxembourg and some Portuguese, Italian, Spanish and French regions).

Finally, we investigate the relevance of taking into account the cross-sectional autocorrelation in the disturbances using the Breusch and Pagan (1980) test statistic. Since we are carrying out pairwise comparisons, the test statistic is simply equal to $T \hat{R}_{ij}^2$, where $\hat{R}_{ij}$ is the sample correlation between the GLS residuals from the i-th and the j-th region. Cross-sectional correlation in the innovations is statistically significant at the 5% level in 36.7% of the cases when using conventional estimates of λ, and in 33.5% of the cases when using median unbiased estimates of λ. The cases of no autocorrelation prevail in the UK, Portugal and Greece, suggesting that these countries have experienced shocks which are different from those of the rest of the EU.16 The large number of cases of significant autocorrelation across regions (also belonging to different countries) suggests the importance of taking into account the covariance in the innovations when testing for equality in the parameters.
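The pairwise version of the Breusch and Pagan statistic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the residual series are hypothetical, and each statistic is compared with the 5% critical value of a chi-square distribution with one degree of freedom (3.841), the usual asymptotic reference for a single pair.

```python
from itertools import combinations

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of a chi-square with 1 d.o.f.


def correlation(u, v):
    """Sample correlation between two equally long residual series."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def pairwise_bp(residuals):
    """Return {(i, j): T * r_ij**2} for every pair of regions."""
    stats = {}
    for i, j in combinations(sorted(residuals), 2):
        t = len(residuals[i])
        stats[(i, j)] = t * correlation(residuals[i], residuals[j]) ** 2
    return stats


n_regions = 95
print(n_regions * (n_regions - 1) // 2)  # number of distinct pairs of regions

# Hypothetical residuals for three regions (made-up numbers).
residuals = {
    "r1": [0.5, -0.2, 0.1, -0.4, 0.3, -0.3],
    "r2": [0.4, -0.1, 0.2, -0.5, 0.2, -0.2],
    "r3": [0.1, 0.3, -0.2, 0.0, -0.3, 0.1],
}
for pair, stat in pairwise_bp(residuals).items():
    print(pair, round(stat, 2), stat > CHI2_1_CRIT_5PCT)
```

Counting the share of pairs for which the statistic exceeds the critical value gives rejection frequencies of the kind reported in the text.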

16 For a visual inspection of the patterns of correlation in the innovations across regions see Meliciani and Peracchi (2004), Fig. 9.

6 Conclusions

This paper analyzes convergence in per-capita GDP across European regions using a very standard model (a deterministic linear trend model with AR(1) errors) while trying to overcome some of the problems of previous empirical studies, which have ignored the regional heterogeneity in the model parameters and the short time series dimension of the available data. Heterogeneity in the model parameters has been addressed using heterogeneous panel estimators instead of more restrictive “Barro regressions” or fixed-effects estimators, whereas the issues arising from the short time series dimension of the data have been addressed by using median unbiased estimators of the autoregressive parameter in the model. Our Monte Carlo simulations show that, for values of the autoregressive parameter commonly found in convergence studies, the larger sampling variability of median unbiased estimators relative to conventional estimators is more than compensated by the smaller bias, resulting in a sampling distribution that is more concentrated about the target parameter.

We find that, for more than half of the European regions considered, the value of the median unbiased estimator is equal to one, implying no convergence to a steady-state level of per-capita GDP. The mean rate of convergence across regions using median unbiased estimators is about .18, less than half the value found using conventional estimators. These results suggest that there are serious problems in estimating the rate of convergence from short time series without properly taking into account the downward bias in the conventional estimates of the autoregressive parameter.

Conventional t-tests on the parameters of the linear trend in the model would also lead to rejecting the null hypothesis of equality with a probability that is much larger than the nominal size of the test. Moreover, the discrepancy between the actual and the nominal size increases with the value of the autoregressive parameter. To address this problem we have carried out t-tests on the parameters of the linear trend replacing the conventional estimates of λ with median unbiased ones. To test hypotheses on the equality of the parameters across regions we have also taken into account the cross-sectional dependence in the error term. While tests based on conventional estimates of λ reject growth homogeneity in a majority of cases, tests based on median unbiased estimates of λ lead to the conclusion that regional trend growth rates differ in a minority of cases.

Further, by allowing all parameters to differ across regions, this study also reveals strong spatial patterns of correlation in the trend growth rates. We find that, despite the increasing integration among European regions, trend growth rates are still highly correlated between regions belonging to the same country. If the trend growth rate captures the rate of growth of technology, as suggested by the neoclassical growth model, it appears that the diffusion of technology is still easier within one country than across borders.

Acknowledgements We thank Badi Baltagi, Michele Boldrin, David Levine, Hashem Pesaran and Melvyn Weeks for useful comments on an earlier draft of this paper. Financial support from CNR and MIUR is gratefully acknowledged.

References

Andrews DWK (1993) Exactly median unbiased estimation of first order autoregressive/unit root models. Econometrica 61:139–165
Andrews DWK, Chen HY (1994) Approximately median unbiased estimation of autoregressive models. J Bus Econ Stat 12:187–414
Armstrong HW (1995) Convergence among regions of the European Union 1950–1990. Pap Reg Sci 74:143–152
Barro RJ, Sala-i-Martin X (1991) Convergence across states and regions. Brookings Pap Econ Act 137–158
Bernard AB, Durlauf SN (1995) Convergence in international output. J Appl Econ 10:97–108
Bernard AB, Durlauf SN (1996) Interpreting tests of the convergence hypothesis. J Econom 71:161–173
Boldrin M, Canova F (2000) Inequality and convergence: reconsidering European regional policies. Econ Policy 32:207–253
Breusch TV, Pagan AR (1980) The LM test and its applications to model specification in econometrics. Rev Econ Stud 47:239–254
Canjels E, Watson MW (1997) Estimating deterministic trends in the presence of serially correlated errors. Rev Econ Stat 79:184–200
Canova F, Marcet A (1995) The poor stay poor: non-convergence across countries and regions. CEPR Discussion Paper n. 1265
Chipman JS (1979) Efficiency of least squares estimation of linear trend when residuals are autocorrelated. Econometrica 47:115–128
Cliff AD, Ord JK (1973) Spatial autocorrelation. Pion Limited, London
Cochrane D, Orcutt GH (1949) Application of least squares regression to relationships containing autocorrelated error terms. J Am Stat Assoc 43:32–61
de la Fuente A (1996) On the sources of convergence: a close look at the Spanish regions. CEPR Discussion Paper n. 1543
de la Fuente A (1998) What kind of regional convergence? CEPR Discussion Paper n. 1924
Dufour JM (1990) Exact tests and confidence sets in linear regressions with autocorrelated errors. Econometrica 59:475–494
Islam N (1995) Growth empirics: a panel data approach. Q J Econ 110:1127–1170
Lee K, Pesaran MH, Smith R (1997) Growth and convergence in a multi-country empirical stochastic Solow model. J Appl Econ 12:357–392
Meliciani V, Peracchi F (2004) Convergence in per-capita GDP across European regions: a reappraisal. CEIS Working Paper n. 204
Neven D, Gouyette C (1995) Regional convergence in the European Community. J Common Mark Stud 33:47–65
Orcutt GH, Winokur HS (1969) First order autoregressions: inference, estimation and prediction. Econometrica 37:1–14
Paci R (1997) More similar and less equal: economic growth in the European regions. Weltwirtsch Arch 133:609–633
Park RE, Mitchell BM (1980) Estimating the autocorrelated error model with trended data. J Econom 13:185–201
Phillips PCB, Sul D (2003) Dynamic panel estimation and homogeneity testing under cross section dependence. Econ J 6:217–259
Quah D (1996) Regional convergence clusters across Europe. Eur Econ Rev 40:286–952
Quenouille MH (1956) Notes on bias in estimation. Biometrika 43:353–360
Rodríguez-Pose A (1998) The dynamics of regional growth in Europe. Social and political factors. Clarendon Press, Oxford
Sala-i-Martin X (1996) Regional cohesion: evidence and theories of regional growth and convergence. Eur Econ Rev 40:1325–1352
Solow R (1956) A contribution to the theory of economic growth. Q J Econ 70:65–94

Locational choice and price competition: some empirical results for the Austrian retail gasoline market

Gerhard Clemenz · Klaus Gugler

G. Clemenz (*) · K. Gugler, Department of Economics, University of Vienna, Bruennerstrasse 72, 1210 Vienna, Austria

Abstract Using data from the Austrian retail gasoline market we find that a higher station density reduces average prices. Market (i.e. ownership) concentration does not significantly affect the average price, but is negatively related to the density of stations. Estimation of the pricing and entry equations as simultaneous equations does not alter our conclusions, and suggests causality running from station density to price. We argue that the spatial dimension of markets allows the identification of market conduct, which is particularly relevant for competition policy.

Keywords Spatial competition · Retail gasoline · Pricing regressions

JEL Classification L1 · L13 · L81

1 Introduction

The purpose of the paper is twofold. Firstly, we want to test empirically, for the Austrian gasoline market, two hypotheses derived from models of spatial competition concerned with the relationships between population density, density of outlets, and prices. Secondly, we want to show how these results can be used to determine whether there is price competition or collusion in a market in which the location of suppliers plays an important role. The second purpose is particularly interesting in view of the fact that the European Commission has recently widened the concept of dominance by including joint or collective dominance in merger and antitrust analysis. To judge whether firms compete with each other or whether they collude, competition authorities need to have an appropriate notion of “competition”. That is, to decide whether firms behave anti-competitively, they need to have a benchmark model against which to compare actual market conduct. The textbook model of perfect competition, where price equals marginal cost in equilibrium, is particularly inappropriate for markets characterized by large fixed or sunk entry and exit costs, as e.g. in the retail gasoline market.

We argue that the spatial dimension of markets allows one to identify possible dominance by a firm or group of firms. The retail gasoline market is characterized by a strong spatial dimension, a feature which can be used to identify (anti-)competitive behavior. In particular, provided there is competition between stations, the nearer they are to each other, on average, the lower should be the equilibrium price they can charge. The alternative (collusion) hypothesis would be no or even a positive relation between station density and price. No systematic relation between station density and price is expected if stations collude in price setting so that they effectively eliminate competition between them. A positive relation between station density and price might even result from facilitated collusion if stations are nearer to each other (e.g. if detection lags of deviant behavior are shorter), and/or if higher station density enables station operators to collectively better siphon off the additional consumer surplus that is generated by lower consumer transport costs. Thus, if one explicitly recognizes the spatial dimension of markets, identification of market conduct is possible.1 Moreover, directly utilizing the spatial dimension of markets to identify market conduct obviates the need to use market concentration–price relations, which suffer from problems of reverse causality and endogeneity.
Building on the seminal paper of Hotelling (1929), a large number of theoretical models of spatial competition have been analyzed.2 Though the papers differ considerably with respect to their scope and purpose, it seems fair to say that the following two questions are among the core issues of spatial economics: (i) What determines the equilibrium pattern of locations of firms? (ii) What are the properties of the equilibrium prices if there is spatial competition between firms? Not surprisingly, different models come up with different results, depending on their main focus, but at times also on rather subtle differences in their assumptions. However, the following two hypotheses are supported by, or at least compatible with, the vast majority of theoretical models:

Hypothesis 1 With free entry retail shops tend to be more densely located in areas with a higher population density.

Hypothesis 2 With spatial competition, equilibrium prices tend to be lower the higher the density of seller locations is.

Hypothesis 2 has an obvious consequence for Hypothesis 1: With spatial competition the increase in the density of shops must be less than proportional to the increase in population density since a higher station density reduces the equilibrium price.

1 Guidelines to identify market conduct are particularly relevant given the strict theoretical and data requirements for detecting collusive behavior in other models of competition (see Phlips 1995).
2 For surveys see e.g. Anderson et al. (1992); Beath and Katsoulacos (1991); Beckmann and Thisse (1986); Martin (1993); Tirole (1988).


As far as Hypothesis 1 is concerned, two remarks should be made. Firstly, a positive correlation between station density and population density is also compatible with collusive behaviour. A pure monopolist would also increase the number of outlets if the number of consumers increases, though she would run fewer outlets and increase their number by less than would be observed in a competitive market. Our data do not, however, allow for discrimination between competition and a lack of it with respect to the choice of locations. Secondly, other oligopoly models are also compatible with the observation that the number of firms is increasing in the number of consumers, e.g. the Cournot model. In a Bertrand model or in a pure monopoly with a homogenous good, however, such a relationship would not exist. Considering that petrol is almost homogenous, at the least such an observation would underline the importance of the spatial aspect in the retail market for petrol.

The retail gasoline market appears to be particularly apt for testing predictions of spatial economics for the following reasons.3 (i) Gasoline can be considered as an almost perfectly homogenous good with respect to its physical and chemical properties. (ii) As a consequence, gasoline stations are engaged in direct competition almost entirely with their immediate neighbors, which agrees with most models of spatial competition.4 (iii) Gasoline stations involve substantial entry and exit costs, and frequently used two-stage models with the choice of location in the first stage and (price) competition in the second stage capture quite well some of the crucial features of the retail gasoline market. (iv) Last, but not least, relevant data are available, particularly because prices are quite transparent and well documented.

In spite of this, to the best of our knowledge this is the first empirical test of the two above-mentioned hypotheses resulting from models of spatial competition for the retail gasoline market. There exists, however, a fair number of empirical studies of the gasoline market, though their focus is different from that of this paper. Several authors have addressed the question whether recent game theoretic models are compatible with observed price movements in gasoline markets, most notably Slade (1987, 1992), Castania and Johnson (1993) and Borenstein and Shepard (1996). Spatial competition, however, is not a main concern in these papers. Borenstein’s (1991) focus is on the determinants of margin differences between leaded and unleaded gasoline. Others have used data from gasoline markets to assess the impact of policy measures or of certain contractual arrangements on gasoline prices (Anderson and Johnson 1999; Johnson and Romeo 2000; Shepard 1993). An interesting line of research concerns the choice of contract between gas stations and their suppliers (Slade 1996; 1998). Finally, the demand for gasoline has been estimated by several authors (Schmalensee and Stoker 1999; Baltagi and Griffin 1997). Considine (2001) analyses an upstream market, petroleum refining.5

3 A more detailed description of the structure of a retail gasoline market can be found in von Weizsäcker (2002).
4 For a recent test of the spatial dimension of competition, see Pinkse et al. (2001). They conclude that competition in the wholesale gasoline market is highly localized. It appears that competition in the retail gasoline market is even more likely to be localized.

We show that both of the above hypotheses are very well supported by the data. Using the 121 political districts of Austria as regional units we find that population density explains more than 95% of the cross-district variation in the density of gasoline stations. As far as the relationship between prices and the density of gas stations is concerned, we find in all specifications that the coefficient has the predicted negative sign and is significant at the 5% level or better. Market (ownership) concentration does not have a clear-cut relation to price. Moreover, we do not obtain different results when we estimate a simultaneous equations system, nor when we choose different regional units.

The plan of the paper is as follows. In the next section we give a brief outline of the theoretical rationale for the two hypotheses we are going to test. In Section 3 we describe the data, and in Section 4 we present our empirical results. Section 5 concludes.

2 Theoretical background

Probably the best-known model of spatial competition is the circle model of Salop (1979).6 This model has been modified in a number of ways. Capozza and Van Order (1980) have made the distinction between immobile and portable firms, and Eaton and Wooders (1985) have analysed equilibria in models where relocation is prohibitively costly. The analysis becomes rather involved, and in particular the equilibrium cannot be expected to be unique (if one exists at all), or to require zero profits. In what follows, therefore, we focus on a description of those aspects of the Salop (1979) model which are relevant for our purpose (see the Appendix for more details).
5 Bresnahan and Reiss (1990, 1991) focus on how the number of firms in a market relates to market size, and thereby infer how market power relates to the number of firms. There are a few empirical studies on spatial aspects of competition for other markets (Asplund and Sandin 1999; Claycombe and Mahan 1993; Fik 1988), whose focus, however, is different from ours. In particular, locational choice is not part of these investigations.
6 See also chapter 6 of Anderson et al. (1992).

A crucial feature of pure spatial competition is that each consumer buys at the shop where total costs, consisting of price (times quantity) plus any transport costs she has to incur, are smallest. Consequently, each shop has a “local monopoly” whose geographical size depends on the prices charged by the nearest competitors and the transport costs consumers have to incur at different shops in a given area. The latter depend to a large extent on the distances between different shops, but also on the quality of the roads, the availability of public transport, etc. Clearly, the price a shop can charge is increasing in the distance from the nearest competitors and in the transport costs of consumers. The demand such a local monopoly is facing depends not only on the geographical size of the market, but also on the total number of consumers in that area and therefore, for a given area, on the population density, denoted as D.

When choosing a location a firm wants to be where there are many consumers but only few competitors. If there are no entry restrictions, firms will establish outlets in a region as long as the setup costs are smaller than the expected profits. In a more densely populated region firms can locate closer to each other than in thinly populated regions because demand per square kilometer is greater. However, the number of shops will increase less than proportionally to the population density since the greater proximity of shops will reduce the equilibrium price.

In reality, additional factors may affect the location decisions of firms. Most obviously, it is not just the number of consumers, but also the demand per consumer, which determines the expected profit per shop. The per capita demand, in turn, depends on the per capita income and, as far as the demand for gasoline is concerned, the number of cars per capita, denoted as V.

Another complication arises from the fact that the simplifying assumption that each firm has only one location is certainly not true for the retail gasoline market. Unfortunately, there is no straightforward answer to the question of how market (i.e. ownership) concentration will affect the density of shops. It seems safe to say that the number of outlets a pure monopolist without entry threat will run is the lower bound. Conversely, the upper bound for the number of shops is given by the number of locations a monopolist will set up if there is free entry.7 Beyond that we have no clear prediction concerning the relationship between market concentration and density of gasoline stations.

Finally, it is worth mentioning that even the retail gasoline market does not fully conform to pure spatial competition. Some consumers have a preference for particular brands, and gas stations compete not only via prices, but also by offering special services, running shops, etc. It is hard to tell, however, how effective these additional strategic variables actually are, and as far as our empirical analysis is concerned we do not have reliable data to test their impact.
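The claim that the number of shops grows less than proportionally with population density, while the price falls, can be illustrated with the textbook version of the Salop circle model. The sketch below is an assumption-laden illustration and not the specification of the paper's Appendix: consumers with unit demand are spread uniformly on a circle of unit circumference, transport costs are linear in distance, shops are equally spaced, and the number of shops is treated as a continuous variable. All names and parameter values are hypothetical.

```python
import math


def salop_free_entry(density, transport_cost, setup_cost, marginal_cost):
    """Free-entry equilibrium on a circle of unit circumference.

    With n equally spaced shops and linear transport costs, the symmetric
    price is marginal_cost + transport_cost / n. Each shop serves
    density / n consumers, so zero profit requires
    transport_cost * density / n**2 = setup_cost.
    """
    n = math.sqrt(transport_cost * density / setup_cost)
    price = marginal_cost + transport_cost / n
    return n, price


# Hypothetical parameter values chosen only for illustration.
for density in (100.0, 200.0, 400.0):
    n, price = salop_free_entry(density, transport_cost=1.0,
                                setup_cost=1.0, marginal_cost=1.0)
    print(f"D = {density:5.0f}  shops = {n:5.2f}  price = {price:.3f}")
```

Under these assumptions the number of shops rises with the square root of the population density, so a fourfold increase in density only doubles the number of shops, and the equilibrium price margin over marginal cost shrinks as shops move closer together.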
In addition to the variables already defined above we use the following notation: C is some measure of market concentration, T are consumer transport costs, and S is the density of shops. The above discussion on the determinants of the density of gasoline stations can be summarized by the following equation and partial derivatives:

\[
S = S(D, V, T, C, \ldots), \quad
\frac{\partial S}{\partial D} > 0,\;
\frac{\partial^2 S}{\partial D^2} < 0,\;
\frac{\partial S}{\partial V} > 0,\;
\frac{\partial^2 S}{\partial V^2} < 0,\;
\frac{\partial S}{\partial T} > 0,\;
\frac{\partial S}{\partial C} = \,? \qquad (1)
\]

That is, we expect the demand variables D and V to positively affect station density. Since larger station density implies increased competition and thus lower equilibrium prices, S is expected to be increasing in D and V, though at a decreasing rate. Station density is also increasing in consumer transport costs T. The question mark for the partial derivative with respect to market concentration C captures the ambiguity of predictions.

7 A monopolist who wants to prevent entry will set up more outlets than would result from free entry with single-location firms, since the monopolist can charge a higher than competitive price as long as no entry occurs, and a higher density of outlets makes her “tougher” if a competitor enters the market.

Consider next the equilibrium price for given locations of shops. As argued above, with spatial competition prices can be expected to be increasing in the distances between shops and increasing in the transport costs of consumers. In our empirical analysis we use S, the density of shops, as an (inverse) proxy for these distances. Furthermore, equilibrium prices are increasing in marginal costs, denoted as c. An interesting question concerns the impact of market concentration on the retail price. We would expect prices to be increasing in the degree of concentration for at least two reasons: a) If a firm is able to set up a cluster of outlets such that some of her shops have only shops run by herself as nearest “competitors”, then these shops are protected from outside competition and can charge a higher price than with pure spatial competition. b) In highly concentrated markets tacit collusion is more likely to occur than in markets with many competitors. However, concentration may be endogenously determined and simply proxy for the efficiency of multi-branch firms, leading to lower retail prices.8 Thus, we do not make strong predictions as to the effects of concentration. To sum up, the price equation and partial derivatives can be written as

\[
P = P(S, T, c, C, \ldots), \quad
\frac{\partial P}{\partial S} < 0,\;
\frac{\partial P}{\partial T} > 0,\;
\frac{\partial P}{\partial c} > 0,\;
\frac{\partial P}{\partial C} = \,? \qquad (2)
\]
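The signs in (1) and (2) can be illustrated with a textbook special case, which is not the authors' own derivation: the symmetric circle model of Salop (1979). The sketch assumes a market of unit circumference, consumers of density D distributed uniformly, a linear transport cost T per unit of distance, n equally spaced single-outlet firms with marginal cost c, and a fixed setup cost F. The symbols n and F are introduced here only for the illustration.

```latex
% Symmetric price equilibrium for a given number of firms n
\[ P = c + \frac{T}{n} \]
% Profit per firm: margin times demand D/n, minus the setup cost F
\[ \pi = (P - c)\,\frac{D}{n} - F = \frac{T D}{n^{2}} - F \]
% Free entry drives profit to zero
\[ n^{*} = \sqrt{\frac{T D}{F}}, \qquad P^{*} = c + \sqrt{\frac{T F}{D}} \]
```

In this special case the number of outlets rises with the square root of D, i.e. less than proportionally, and rises with T, while the price falls in the number of outlets and rises in T and c. These are the signs assumed in (1) and (2); the elasticity of one half is specific to the assumptions above and should not be read as a prediction for the empirical estimates.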

With spatial competition, we expect a higher station density S to reduce the equilibrium price. As already mentioned, the alternative hypothesis would be no or even a positive relation between station density and price, if stations collude in price setting so that they effectively eliminate competition between them. Larger consumer transport costs T and larger marginal costs c increase price. Expectations are ambiguous concerning the effects of market concentration C on price. In (1) and (2) we have assumed that entry decisions precede price competition, that is, that station density is a predetermined variable with respect to price. We will, however, test whether S and P are simultaneously determined by estimating (1) and (2) as simultaneous equations below.

8 See Weiss (1989) for a survey of concentration–price studies. See Barros (1999) for a model on multi-branch firms and evidence on Portugal.

3 The data

To test the predictions of spatial competition as outlined in Section 2, we first assembled a comprehensive list of gasoline stations in Austria as of the beginning of 2001. Unfortunately, no comprehensive list of stations is available from a single source; we therefore had to construct a list from the following sources: Statistik Austria (Austrian Statistical Office), the ÖAMTC (an Austrian automobile club), and information provided by the petroleum companies (in the order of their market shares) OMV AG, BP Austria AG, SHELL, ESSO, AGIP and ARAL. In this way we could locate 2,856 gasoline stations in Austria by zip code and address. Additionally, we know the name of the oil company operating each station or whether the station is operated by an independent retailer. According to the Fachverband der Mineralölindustrie (Association of the Petroleum Industry in Austria), there were 2,957 operating gasoline stations in Austria as of the beginning of 2001; thus our list covers 96.6% of all gasoline stations in Austria. We use the number of gasoline stations rather than output or sales as the basis to calculate concentration figures. This has the advantage that our measures of concentration are less subject to the kind of endogeneity problems mentioned by Evans et al. (1993).9

For 1,603 (54.2%) gasoline stations operated by the firms OMV AG, BP Austria AG, SHELL, AGIP and ARAL we obtained retail price information on a daily basis for the period 1 November 2000 until 31 March 2001 for the gasoline brand EUROSUPER (unleaded gasoline with 95 octane), which is the most important brand in Austria. This implies that we do not have price information on independent retailers. We include, however, the percentage of stations operated by independent retailers in the pricing regressions presented in Section 4 as an additional control.

A rather tricky problem is the delineation of local gasoline markets and the definition of “regions”. Austria consists of nine federal states subdivided into 121 districts, which consist of roughly 2,400 municipalities (i.e. zip-code level). We use the districts as relevant regions. This choice is a compromise between a market definition that is too narrow (as is probably the case if we take zip codes or the like as our region) and one that is too wide (if we took e.g. federal states).10 Note, however, that to the extent that we measure the relevant market inaccurately, our estimates are likely to underestimate the true relationships. Unless the inaccuracy is correlated with our variables of interest, the most likely effect is increased white noise reducing statistical significance. In any case, we present robustness tests using the narrow market definition at the zip-code level. For each of the 121 districts, we calculate the variables as defined in Table 1.
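The station-count-based measures described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `district_measures`, its arguments, and the toy district are hypothetical, and every distinct operator label is treated as one firm, which the text does not spell out for independent retailers.

```python
from collections import Counter

def district_measures(firm_of_station, population, area_km2,
                      independent_firms=frozenset()):
    """Station-count-based measures for one district (hypothetical helper)."""
    n = len(firm_of_station)              # N_k: number of stations
    counts = Counter(firm_of_station)     # N_{n,k}: stations per firm
    # Market shares by station count, largest first
    shares = sorted((c / n for c in counts.values()), reverse=True)
    n_indep = sum(c for f, c in counts.items() if f in independent_firms)
    return {
        "F": len(counts),                    # number of firms
        "N": n,
        "S": n / area_km2,                   # station density
        "D": population / area_km2,          # population density
        "C1": shares[0],                     # share of the largest firm
        "C4": sum(shares[:4]),               # share of the four largest firms
        "HERF": sum(s * s for s in shares),  # sum of squared shares
        "INDEPENDENT": n_indep / n,          # share of independent retailers
    }

# Toy district with ten stations (illustrative numbers only)
example = district_measures(
    ["OMV"] * 4 + ["BP"] * 3 + ["SHELL"] * 2 + ["indep_a"],
    population=50_000, area_km2=500.0, independent_firms={"indep_a"},
)
print(example)
```

Because the shares are based on station counts rather than sales, they can be computed from the station list alone.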
9 Concentration–price regressions suffer mainly from two sources of bias. First, concentration normally is a function of endogenous firm outputs or revenues. Second, performance feeds back into market structure, that is, concentration causes price, but price also causes concentration. Using the number of gasoline stations as the basis of our concentration measures should reduce the first bias.

10 Defining the relevant market is beyond the scope of this paper. See Slade (1986) for such an attempt.

The dependent variables are the margin M and the density of gasoline stations S in a particular district, with M = P − c. P is the daily retail price charged for EUROSUPER net of all taxes (a 20% sales tax and a gasoline quantity tax of 5.61 ATS/l) in ATS per liter, averaged over the period 1 November 2000 to 31 March 2001 and over all stations within a district. To obtain estimates of marginal cost c we utilize information on PLATT product notations in Amsterdam. The market in Amsterdam and more generally the “ARA area” (Amsterdam–Rotterdam–Antwerp) is the most important spot market determining gasoline prices in Europe. More than 14% of European refinery capacity and most of European petroleum imports are located in this area (Puwein and Wüger 1999). Our strategy to proxy marginal costs for Austrian gasoline stations is therefore to apply a limit pricing argument in that marginal costs are equal to these PLATT prices plus transportation costs (to and within Austria) and variable remuneration of gasoline operators. Specifically, marginal cost c is proxied by the sum of (1) the average daily PLATT price of EUROSUPER in Rotterdam over the period 1 November 2000 to 31 March 2001, converted to ATS from USD using daily exchange rates (which


G. Clemenz and K. Gugler

Table 1 Variable definitions and data sources

| Variable | Definition | Source(s) |
|---|---|---|
| Popk | Number of inhabitants in district k. | SA |
| Ak | Area of district k in square kilometers. | SA |
| Fk | Number of firms operating gasoline stations in district k as of beginning of 2001. | SA; ÖAMTC; “Majors” |
| Nk | Number of gasoline stations in district k as of beginning of 2001. | SA; ÖAMTC; “Majors” |
| Pk | Retail price charged for EUROSUPER (unleaded gasoline with 95 octane) (total of 1,603 gasoline stations) net of all taxes per liter, averaged over the period 1 November 2000 and 31 March 2001 and averaged over all stations within district k in ATS* per liter, i.e. $P_k = \frac{1}{T N_k}\sum_{i=1}^{N_k}\sum_{t=1}^{T} P_{i,t}$, where T = 151, the number of days between 1 November 2000 and 31 March 2001. | “Majors” without ESSO; Puwein und Wüger (1999); FV |
| Mk = Pk − c | Difference between Pk and marginal cost in ATS* per liter. Marginal cost c is proxied by the sum of (1) the average daily PLATT product notations of EUROSUPER in Rotterdam over the period 1 November 2000 and 31 March 2001, (2) estimates of transportation to Austria per liter, (3) estimates of distribution costs within Austria per liter and (4) estimates of the per liter remuneration of gasoline operators. | “Majors” without ESSO; Puwein und Wüger (1999); FV |
| Sk = Nk/Ak | Density of gasoline stations in district k. | SA; ÖAMTC; “Majors” |
| Dk = Popk/Ak | Population density in district k. | SA |
| C1k | Market share of the largest firm in district k, defined as $C1_k = N_{1,k}/N_k$, where N1,k is the number of gasoline stations operated by the largest firm in district k. | SA; ÖAMTC; “Majors” |
| C4k | Sum of market shares of the largest four firms in district k, $C4_k = \sum_{n=1}^{4} N_{n,k}/N_k$, where Nn,k is the number of gasoline stations operated by the n-th largest firm in district k. | SA; ÖAMTC; “Majors” |
| HERFk | Sum of squared market shares of all firms in district k, $HERF_k = \sum_{n=1}^{F_k} \left(N_{n,k}/N_k\right)^2$. | SA; ÖAMTC; “Majors” |
| INDEPENDENTk | Share of gasoline stations operated by independent retailers in district k. | SA; ÖAMTC; “Majors” |
| Vk | Degree of motorization defined as the number of motor-operated vehicles per head in district k. | SA |
| ALPSk | Share of alps and woods of total area in district k. | SA |

SA: Statistik Austria (Austrian Statistical Office). FV: Fachverband der Mineralölindustrie (Association of the petroleum industry).
*13.76 ATS = 1 EURO
**The largest six Austrian oil companies are often called “majors” (i.e. OMV AG, BP Austria AG, SHELL, ESSO, ARAL and AGIP)

Locational choice and price competition


equaled 3.01 ATS/l), (2) estimates of transportation costs to Austria per liter (0.20 ATS/l; Source: Puwein and Wüger 1999), (3) estimates of distribution costs within Austria per liter (0.10 ATS/l; Source: Puwein and Wüger 1999), and (4) estimates of the per liter remuneration of service station operators (0.30 ATS/l; Source: Fachverband der Mineralölindustrie). Therefore, we estimate marginal costs c at 3.61 ATS/l over the period of analysis. This strikes us as the most plausible estimate of marginal costs. We experimented with a number of values ranging from 3 to 4 ATS/l; however, the results for the margin equation in Section 4 are virtually the same.

Several additional arguments support our approach. First, the whole Austrian territory can be supplied by three refineries: Schwechat, Mestre, and Ingolstadt, with more than 60% of total supply stemming from Schwechat. There is a product pipeline in Austria, transporting the overwhelming bulk of gasoline. Thus, there is not much variation in the production and distribution technology of wholesale supply of gasoline. Second, as Puwein and Wüger (1999) note, transportation costs within Austria are a minor component of marginal costs. Thus, it is likely that marginal costs of gasoline at the station do not vary substantially across Austria. Nevertheless, we include federal state and/or district dummies in the margin equations estimated below. Fixed federal state or district effects may arise due to differing distribution and remuneration costs and thus differing marginal costs within Austria.

Figure 1 displays the evolution of average P (net of all taxes) in Austria and the PLATT notations for EUROSUPER as well as BRENT crude oil in Rotterdam. As can be seen, retail prices first decrease until around mid-January 2001, increase until mid-February, and then remain roughly constant. PLATT notations are a bit more volatile than retail prices in Austria (coefficient of variation of 0.10 for

Fig. 1 Average EUROSUPER retail price in Austria and PLATT’s notations


EUROSUPER and 0.15 for BRENT versus 0.07 for average retail prices in Austria). Therefore, we are confident that the time period is long enough and the turbulence in the markets was sufficiently low so that we capture structural differences in M across districts and not merely short-run disequilibrium phenomena.

Table 1 presents detailed definitions of the variables used in the subsequent regression analysis. Table 2 presents summary statistics. On average, districts extend to around 700 km² with nearly 70,000 inhabitants. An average of 5.6 firms operate 23.7 gasoline stations per district. The mean before-tax price of a liter of EUROSUPER was 5.07 ATS with a quite sizeable range of 4.66 to 5.40 across districts. The average margin is 1.46 ATS. On average, the patch of a service station is 31.6 km² (= 1/S), and the median population density is 87.3 inhabitants per square kilometer. The largest firm on average operates more than a quarter of gasoline stations, average C4 is 65.1% and the average HERF is 16.1%. Around one third of gasoline stations are operated by independent marketers. The degree of motorization V varies considerably across districts with a mean of 0.72 motorized vehicles per person and a maximum of more than two. Nearly 40% of the area is alpine or covered with woods.

Table 2 Summary statistics on the district level

| Variable | Mean | Std. dev. | Median | Max | Min | No. of obs. |
|---|---|---|---|---|---|---|
| Popk (inhabitants) | 67,335 | 37,873 | 59,370 | 241,530 | 1,740 | 121 |
| Ak (in km²) | 703.7 | 629.5 | 669.1 | 3,270.1 | 1.5 | 121 |
| Fk (firms) | 5.6 | 3.2 | 5.0 | 17.0 | 1.0 | 121 |
| Nk (stations) | 23.7 | 15.5 | 21.0 | 96.0 | 1.0 | 121 |
| Pk (in ATS) | 5.07 | 0.14 | 5.08 | 5.40 | 4.66 | 121 |
| Mk (in ATS) | 1.46 | 0.14 | 1.47 | 1.79 | 1.05 | 121 |
| 1/Sk (km²/station) | 31.6 | 26.8 | 29.2 | 113.3 | 0.3 | 121 |
| Dk (inhabitants/km²) | 1,888.7 | 4,706.7 | 87.3 | 26,028.6 | 21.1 | 121 |
| C1k (in %) | 25.8% | 10.2% | 23.5% | 100.0% | 10.7% | 121 |
| C4k (in %) | 65.1% | 13.3% | 62.5% | 100.0% | 35.7% | 121 |
| HERFk (in %) | 16.1% | 10.0% | 14.0% | 100.0% | 5.9% | 121 |
| INDEPENDENTk (in %) | 33.6% | 13.8% | 33.3% | 87.5% | 0.0% | 121 |
| Vk (number of motor vehicles/head) | 0.72 | 0.20 | 0.73 | 2.24 | 0.37 | 121 |
| ALPSk (in %) | 39.3% | 24.3% | 39.0% | 80.8% | 0.0% | 121 |

For definitions of variables, see Table 1

4 Results

This section presents our results in two steps. First, we explain the density of gasoline stations. These regressions give insight into the determinants of entry into the Austrian retail market of gasoline. From Section 2 we hypothesize that the main determinants of the density of gasoline stations are population density and the degree of motorization as proxies of demand, and market concentration. Second, we present the results on the price equation. Here the main theoretical prediction is that the price is decreasing in station density (or increasing in the average distance between gasoline stations). Controls include the share of independent marketers and additional proxies of transport costs.

4.1 The density of gasoline stations

From (1), gasoline station density is explained by variables proxying for demand and market structure11

\[
\ln S_k = \alpha_0 + \alpha_1\, \mathit{DEMAND}_k + \alpha_2\, C_k + \varepsilon_k \qquad (3)
\]

where k = 1, ..., 121 denotes administrative districts in Austria; lnSk is the (logarithm of the) number of gasoline stations per square kilometer in district k; DEMANDk = {lnDk, lnVk} are the (logarithms of the) number of inhabitants per square kilometer and the number of motorized vehicles per capita in district k; Ck = {lnC1k or lnC4k or lnHERFk} is the (logarithm of the) share of the largest firm, the share of the largest four firms, or the Herfindahl index in district k; and εk is an error term.12

Table 3 presents the results. As theory would predict, population density virtually completely determines the density of gasoline stations. Population density explains more than 95% of the cross-district variation in the density of gasoline stations. Figure 2 shows that the fit is nearly perfect. The coefficient estimate of 0.81 (t = 41.10) implies that for each percent increase in the number of inhabitants per square kilometer the number of gasoline stations per square kilometer increases by around 0.8%. This conforms to predictions of models of spatial competition that the number of outlets increases less than proportionally to consumer density, since the greater proximity of shops reduces the equilibrium price.

Equation 2 of Table 3 includes 0–1 dummies for federal states, of which there are nine in Austria. We include federal state effects because entry conditions may differ across federal states due to differing regulations, e.g. concerning the environment, building regulation etc., which affect fixed entry and exit costs. Our estimates are robust to the inclusion of these dummies, and the coefficient on lnD rises to 0.90 with a t-value of 17.77. The F-statistic indicates that fixed federal state effects are not significant at conventional levels; thus we leave them out in Eqs. 3, 4, 5, 6, and 7. These tests show that differences across federal states are not large enough to significantly affect entry/exit decisions; what counts is population density. We will return to fixed federal state effects when we analyze the margin equation, however.

11 We tried ALPSk in Eq. 3 as a proxy for consumer transport costs T. Since this variable was always insignificant and its inclusion never changed the results on the other variables, we do not report it.

12 Since we do not have quantity or sales data we cannot estimate a fully structural model and instead estimate reduced forms. However, if gasoline demand is fairly inelastic (which is likely to be true; see Puwein and Wüger (1999), who estimate a demand elasticity of only 0.2), our demand proxies, population density and number of cars, are likely to capture variation in demand across submarkets accurately.

Population density is fairly skewed across districts due to the presence of urban areas, most notably Vienna. It may be the case that entry decisions are influenced by quite different factors in cities than in the countryside, e.g. by the availability of

Table 3 The density equation, district level

Dependent variable: lnSk. Cells show the coefficient with the t-value in parentheses.

| Equation | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Sample | All districts | All districts | Districts excluding Vienna | All districts | All districts | All districts | All districts |
| lnDk | 0.810 (41.10) | 0.900 (17.77) | 0.829 (11.94) | 0.816 (48.40) | 0.832 (51.63) | 0.835 (47.35) | 0.873 (50.48) |
| lnC1k | | | | −0.132 (−0.89) | | | |
| lnC4k | | | | | −0.613 (−3.20) | | |
| lnHERFk | | | | | | −0.306 (−2.14) | −0.465 (−5.05) |
| lnVk | | | | | | | 0.268 (1.95) |
| Constant | −7.014 (−75.03) | −7.481 (−29.87) | −7.096 (−24.02) | −7.229 (−32.07) | −7.406 (−59.05) | −7.729 (−22.81) | −8.148 (−39.74) |
| Fixed federal state effects | No | Yes | No | No | No | No | No |
| F-test of fixed effects (p value) | | 0.239 | | | | | |
| Adjusted R2 | 0.953 | 0.957 | 0.873 | 0.954 | 0.957 | 0.958 | 0.963 |
| No. of obs. | 121 | 121 | 98 | 121 | 121 | 121 | 121 |

Note: Estimation method is OLS with White (1980) heteroscedasticity consistent standard errors


Fig. 2 The relationship between population and gasoline station density (vertical axis: lnS; horizontal axis: lnD)

space etc. Therefore we test for the robustness of our results by excluding the 23 districts of Vienna. Equation 3 shows that results are unaltered and the influence of population density is virtually the same in Vienna as in other administrative districts. When we restrict the sample to those districts where population density is smaller than 500 inhabitants per square kilometer (thus effectively restricting the sample to the 90 mostly rural districts), the coefficient rises to 0.90 (t = 12.70). Thus, there is some evidence that entry decisions in rural areas depend even more on population density than entry decisions in more densely populated areas.

Equations 4, 5, and 6 add our measures of market concentration to the estimating equation. Recall that our measures of market concentration are based on the relative size of firms in the market as measured by the number of gasoline stations operated by them. The logarithm of the share of the largest firm, lnC1, has the expected negative sign but is insignificant, while a larger C4 and a larger Herfindahl index significantly reduce station density. Equation 7 adds the variable lnV, another proxy for demand, which takes on the expected positive sign and is marginally significant at the 5% level.

We chose to present the results on the log–log specification (Eq. 3). It should be noted, however, that our results do not depend on the specific functional form chosen. We experimented with a number of different functional forms and specifications, e.g. the linear model, the linear model including squared terms, or explicitly estimating a power function by non-linear least squares. None of our results changes, and the results from these regressions are available upon request. In particular, all estimations produce a similar concave relationship between S and D. This can be interpreted as an additional specification test of Eq. 1.
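The log–log specification and the robust standard errors mentioned in the table note can be sketched as follows. This is a minimal illustration on simulated data, not a replication: the data-generating values, the function name `ols_white`, and the choice of the plain HC0 variant of the White (1980) estimator are assumptions, since the text does not say which small-sample variant was used.

```python
import numpy as np

def ols_white(X, y):
    """OLS coefficients with White (1980) HC0 standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over i of e_i^2 * x_i x_i'
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated districts (illustrative values, not the Austrian data)
rng = np.random.default_rng(0)
n = 121
ln_d = rng.uniform(3.0, 10.0, size=n)                      # log population density
ln_s = -7.0 + 0.81 * ln_d + rng.normal(0.0, 0.3, size=n)   # log station density

X = np.column_stack([np.ones(n), ln_d])
beta, se = ols_white(X, ln_s)
t_values = beta / se
print("elasticity of S with respect to D:", beta[1])
print("t-values:", t_values)
```

In a log–log regression the slope is directly the elasticity, so a slope below one corresponds to station density rising less than proportionally with population density.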


4.2 The margin equation

The second main prediction of models of spatial competition concerns the relationship between the price, and therefore the margin, that is charged and competition intensity as implied by the distance to the closest competitors: the farther away gasoline stations are from one another on average, the higher will be the margin charged.13 Thus, we operationalize Eq. 2 and estimate

\[
\ln M_k = \ln (P - c)_k = \beta_0 + \beta_1 \ln S_k + \beta_2\, C_k + \beta_3\, \mathit{ALPS}_k + \beta_4\, \mathit{INDEPENDENT}_k + \nu_k \qquad (4)
\]

where k = 1, ..., 121 again denotes administrative districts in Austria; lnMk = ln(P − c)k is the (logarithm of the) average price charged in district k minus our estimate of marginal cost; and lnSk is the (logarithm of the) number of gasoline stations per square kilometer in district k. The latter is an inverse proxy of the average distance between gasoline stations. A larger value of S therefore indicates more intense competition, and we expect β1 < 0 if spatial competition plays a role in the determination of margins. Ck = {lnC1k or lnC4k or lnHERFk} is the (logarithm of the) share of the largest firm, the share of the largest four firms, or the Herfindahl index in district k; ALPSk is the share of alps and woods of total area in district k, included as an additional proxy for differing transport costs across districts; and νk is an error term. As already mentioned, we do not have price data on independent retailers, but we include INDEPENDENTk, the share of independent marketers in district k.

Table 4 presents the results for Eq. 4. In all specifications the coefficient on lnS is negative and significant at the 5% level or better, indicating that the closer competitors are to each other on average, the lower is the margin. The margin equations indicate that, contrary to the gasoline density equation before, fixed federal state effects are significant and explain a fair portion of the cross-sectional variation in margins. The inclusion of these dummies does not render lnS insignificant; on the contrary, coefficients and significance levels rise. One explanation is that our measure of marginal cost, which we assumed invariant across districts and thus federal states, in fact varies across them, e.g. due to differing distribution and remuneration costs. The fixed federal state effects (partially) correct for this. Below we present robustness tests running the margin equation at the zip-code level and including 120 district dummies. Our main results hold up.
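A test of fixed federal state effects of the kind referred to above can be sketched as a comparison of a regression with and without state dummies. The block below is a minimal illustration under stated assumptions: it uses the classical (homoskedastic) F statistic, which may differ from the statistic the authors computed, and the function names and simulated data are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_stat_fixed_effects(x, y, groups):
    """Classical F statistic for H0: all group dummies have zero coefficients."""
    groups = np.asarray(groups)
    n = len(y)
    labels = np.unique(groups)
    X_r = np.column_stack([np.ones(n), x])             # restricted: common intercept
    # One dummy per group, dropping the first group as the baseline
    dummies = (groups[:, None] == labels[None, 1:]).astype(float)
    X_u = np.column_stack([X_r, dummies])              # unrestricted model
    q = dummies.shape[1]                               # number of restrictions
    rss_r, rss_u = rss(X_r, y), rss(X_u, y)
    return ((rss_r - rss_u) / q) / (rss_u / (n - X_u.shape[1]))

# Simulated margins with state-level shifts (illustrative values only)
n, n_states = 121, 9
state = np.arange(n) % n_states
state_effect = np.linspace(-0.15, 0.15, n_states)
rng = np.random.default_rng(1)
ln_s = rng.normal(-3.0, 1.5, size=n)
ln_m = 0.3 - 0.04 * ln_s + state_effect[state] + rng.normal(0.0, 0.05, size=n)

f_stat = f_stat_fixed_effects(ln_s, ln_m, state)
print("F statistic for state effects:", f_stat)
```

With nine states the test has eight restrictions; a large statistic indicates that a common intercept is rejected, which is the situation described for the margin equation.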
13 We report the results on retail margins rather than markups as Borenstein (1991) does. However, results are similar if we take markup as the dependent variable in Eq. 4. We also experimented with a number of other explanatory variables such as the percentage of highway stations in a geographical market (expected positive effect) or whether a geographical market borders on an Eastern European country (expected negative effect). For these variables we generally find the predicted effects. These results are available upon request.

Equations 1, 2, and 3 include (respectively) lnC1, lnC4 and lnHERF as explanatory variables; however, we do not detect a significant influence of market concentration on the margin at the district level. INDEPENDENT takes on negative signs; however, it is only significant when we restrict the sample to the 98 districts outside of Vienna (see Eq. 5). As we have seen in Section 4.1, gasoline station density in an area is determined by demand and cost conditions in a particular market. Equation 4 of Table 4 estimates Eq. 4


Table 4 The margin equation, district level

Dependent variable: lnMk. Cells show the coefficient with the t-value (OLS) or z-value (2SLS) in parentheses.

| Equation | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Sample | All districts | All districts | All districts | All districts | Districts excluding Vienna |
| Method | OLS | OLS | OLS | 2SLS | 2SLS |
| lnSk | −0.036 (−3.15) | −0.035 (−2.99) | −0.036 (−3.10) | −0.039 (−3.68) | −0.045 (−3.90) |
| lnC1k | −0.020 (−0.54) | | | | |
| lnC4k | | 0.023 (0.48) | | | |
| lnHERFk | | | −0.009 (−0.32) | −0.010 (−0.34) | −0.047 (−1.38) |
| ALPSk | 0.054 (0.89) | 0.047 (0.77) | 0.054 (0.90) | 0.065 (1.05) | 0.068 (1.01) |
| INDEPENDENTk | −0.095 (−1.47) | −0.064 (−0.86) | −0.085 (−1.16) | −0.087 (−1.38) | −0.207 (−2.56) |
| Constant | 0.299 (5.11) | 0.327 (9.72) | 0.307 (5.23) | 0.301 (5.16) | 0.230 (3.36) |
| Fixed federal state effects | yes | yes | yes | yes | yes |
| F-test of fixed effects (p value) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Adjusted r2 | 0.413 | 0.414 | 0.408 | 0.433 | 0.472 |
| No. of obs. | 121 | 121 | 121 | 121 | 98 |

Estimation method below “OLS” is OLS with White (1980) heteroscedasticity consistent standard errors. Estimation method below “2SLS” is the two-stage least squares within estimator due to Balestra and Varadharajan-Krishnakumar using lnDk as instrument for lnSk. r2 for 2SLS is defined as “r2” = 1 − RSS/TSS, where RSS is the residual sum of squares and TSS is the total sum of squared residuals about the mean of the dependent variable

by 2SLS instrumenting lnS by lnD. This appears to be an ideal instrument, since population density is exogenous to gasoline prices and, as shown in Section 4.1, almost completely determines station density. The results do not change, and if anything the influence of lnS is larger if we instrument it. We also performed Hausman tests, which showed that endogeneity is not a likely problem, since the coefficients obtained with the less efficient but consistent estimator are not systematically different from those obtained with the fully efficient estimator, i.e. χ2(1) = 0.57. As a final check against endogeneity, we shall estimate Eqs. 3 and 4 simultaneously below.

ALPS, the area share of alps and woods as an additional proxy for transport costs, takes on the right sign; however, it is not significant. One explanation is that S is highly correlated with ALPS (correlation coefficient of 0.72) and S is the dominant force explaining margins. This is confirmed by the fact that when we exclude lnS, ALPS takes on positive and significant coefficients.

4.3 Additional robustness tests

4.3.1 The relevant geographical market

Until now we assumed that districts are accurate in defining the relevant region for gasoline stations. We now test whether our results are changed if we narrow our


Table 5 Robustness Panel A: The margin equation, zip-code level (z) Dependent variable: lnMz Sample:

All zipcodes

Zipcodes excluding Vienna

Equation

(1)

(2)

(3)

(4)

Method

OLS

OLS

OLS

2SLS

2SLS

Independent variables

Coef

t-value Coef

Coef

z-value Coef

lnSz lnClz

−0.007 −2.90

t-value

−0.007

−2.95

0.125

2.97

2.92

0.062

2.97

−0.044 −2.07

−0.037

−1.69 9.34

0.075

lnC4z

INDEPENDENTz

0.061

t-value

−0.006 −2.84

−0.027 −3.09

Constant

0.333 16.03

0.272

Fixed federal state effects

yes

yes

−0.043

−3.30

2.19

0.032

2.00

0.125

3.77

0.045

0.062

2.97

0.021

1.84

0.059

1.89

−0.010 −1.41

−0.202

−2.32

0.235

3.11

−0.041 −1.88 0.416

11.95

yes

0.557 yes

7.82

yes

F-test of fixed effects (p value)

0.000

0.000

0.000

0.000

0.000

adjusted R2

0.261

0.267

0.255

0.245

0.280

No Obs

z-value

1.04

lnHERFz ALPSz

Coef

(5)

803

803

803

803

780

Panel B: The density and the margin equation as simultaneous equations, district level (k)

Cells show the coefficient with the z-value in parentheses.

| Dependent variable | lnSk | lnMk |
|---|---|---|
| lnSk | | −0.031 (−3.93) |
| lnMk | 0.207 (0.47) | |
| lnDk | 0.876 (38.77) | |
| lnHERFk | −0.350 (−3.43) | −0.006 (−0.21) |
| lnVk | 0.247 (1.58) | |
| ALPSk | | 0.051 (1.12) |
| INDEPENDENTk | | −0.080 (−1.37) |
| Constant | −8.02 (−23.92) | 0.395 (8.54) |
| Fixed federal state effects | No | Yes |
| F-test of fixed effects (p value) | | 0.000 |
| “r2” | 0.962 | 0.466 |
| No. of obs. | 121 | 121 |

Note A: Estimation method below “OLS” is OLS with White (1980) heteroscedasticity consistent standard errors. Estimation method below “2SLS” is the two-stage least squares within estimator due to Balestra and Varadharajan-Krishnakumar using lnDz as instrument for lnSz. R2 for 2SLS is defined as “R2” = 1 − RSS/TSS, where RSS is the residual sum of squares and TSS is the total sum of squared residuals about the mean of the dependent variable

Note B: Estimation method is 3SLS with exogenous variables (the instrument list) lnDk, lnHERFk, lnVk, INDEPENDENTk, ALPSk, and eight federal state dummies. “r2” is defined as “r2” = 1 − RSS/TSS, where RSS is the residual sum of squares and TSS is the total sum of squared residuals about the mean of the dependent variable


definition of the relevant region. Panel A of Table 5 presents the results on the margin equation at the zip-code level.14 That is, all variables are now defined at the narrow level of municipalities. There are 2,383 municipalities in Austria. Of these, 1,173 have gasoline stations. We have all the relevant data for 803 zip-code areas. On average, there are 2.4 stations per zip-code area and, provided there is a station, the range is 1 to 46 stations. Thus this market definition is very narrow. As can be inferred from Panel A in Table 5, our results are robust to this change in market definition. Again, using 2SLS and restricting the sample to zip codes outside of Vienna increase the estimated influence of lnS on the margin, consistent with prior reasoning. The measures of market concentration take on a positive sign and, with the exception of C1, are significant at the 5% level or better. The share of independent marketers decreases the margin that can be charged, and the area share of alps and woods as a measure of transport costs increases the margin. These estimates imply that the operational definition of market boundaries does not change our results, with the possible exception of the influence of market (ownership) concentration.

A few words seem in order to explain the validity of our distance measure S. S is a good (inverse) proxy for the average distance between gasoline stations if stations do not cluster in one spot in each market. That is, if entry decisions are taken as suggested by models of spatial competition under subsequent price competition (maximum differentiation), stations optimally locate as far away from each other as possible and S is an appropriate distance measure.15 If stations do cluster, on the other hand, station density S may vary cross-sectionally without changing the average distance between stations by much. We therefore need to assess whether clustering of gasoline stations is a problem.
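The 2SLS estimates referred to above instrument lnS by lnD. The mechanics can be sketched on simulated data as follows. This is a minimal illustration under stated assumptions, not the estimator used in the tables: it omits the fixed effects and the other controls, and the data-generating values and function names are hypothetical. The simulated margin shock u is built into lnS on purpose, so that OLS is biased and the instrument has something to correct.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def two_stage_least_squares(y, x_endog, z_instr):
    """2SLS with one endogenous regressor, one instrument and a constant."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z_instr])
    # Stage 1: fitted values of the endogenous regressor from the instrument
    x_hat = Z @ ols(Z, x_endog)
    # Stage 2: regress the outcome on the fitted values
    return ols(np.column_stack([np.ones(n), x_hat]), y)

rng = np.random.default_rng(42)
n = 5000
ln_d = rng.normal(5.0, 1.5, size=n)     # instrument: log population density
u = rng.normal(0.0, 0.1, size=n)        # unobserved margin shock
ln_s = -7.0 + 0.8 * ln_d + 2.0 * u + rng.normal(0.0, 0.2, size=n)
ln_m = 0.3 - 0.04 * ln_s + u            # assumed true effect of lnS: -0.04

b_ols = ols(np.column_stack([np.ones(n), ln_s]), ln_m)
b_2sls = two_stage_least_squares(ln_m, ln_s, ln_d)
print("OLS slope:", b_ols[1], "2SLS slope:", b_2sls[1])
```

In this simulated setup the 2SLS slope is close to the assumed value, while the OLS slope is pulled towards zero by the correlation between lnS and the shock. Standard errors for the second stage need the usual 2SLS correction and are not computed here.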
Figure 3 presents a frequency distribution of the number of stations per zip code. In 681 or 58.1% of the 1,173 zip codes with stations, there is only one station. In 85.8% of the zip codes there are three or fewer stations, and in only 31 zip-code areas are there more than ten stations. This overwhelmingly suggests that clustering of stations does not occur on average, and thus that S is an appropriate measure of distance. Our distance measure S should work best in zip codes with only one station. If we restrict the sample to those 681 zip codes with only one station, and estimate a regression like Eq. 4 of Panel A in Table 5 by 2SLS, the coefficient and significance of lnSz remain virtually unchanged (−0.025, z = −2.88). This again suggests that S is an appropriate measure of distance.

Finally, it should be mentioned that our results are not altered if we estimate an equation like Eq. 4 of Panel A in Table 5 at the station level, that is, treating the 1,604 stations with price data as the unit of analysis, including district fixed effects, and essentially blowing up (replicating) all explanatory variables. Given that the number of observations increases to more than 200,000, t-values on lnS increase to between 40 and 50. We chose not to report these results, because we view these t-values as inflated given the fact that our proxies of demand, costs and competition do not vary on a daily basis.
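Why stacking daily observations inflates t-values can be seen in a stylized extreme case, assumed here purely for illustration: every unit's observation is copied once per day, so the extra rows carry no new information. Real daily prices do vary, so this is an upper bound on the effect rather than a description of the station-level regressions. All names and simulated values are hypothetical.

```python
import numpy as np

def ols_t_values(X, y):
    """Conventional (homoskedastic) OLS t-values."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)
    return beta / np.sqrt(sigma2 * np.diag(XtX_inv))

rng = np.random.default_rng(3)
n_units, n_days = 200, 151
ln_s = rng.normal(-3.0, 1.5, size=n_units)
ln_m = 0.3 - 0.04 * ln_s + rng.normal(0.0, 0.15, size=n_units)
X = np.column_stack([np.ones(n_units), ln_s])

t_once = ols_t_values(X, ln_m)
# Stack n_days identical copies of every observation
t_stacked = ols_t_values(np.tile(X, (n_days, 1)), np.tile(ln_m, n_days))
ratio = t_stacked[1] / t_once[1]
print("inflation factor of the t-value:", ratio)
```

The coefficients are unchanged by the stacking, but the t-values grow by roughly the square root of the number of copies, about 12 for 151 days. That is of the same order as the move from t-values of about 3 to 4 to values between 40 and 50 reported above.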

14 We also analysed Eq. 3 at the zip-code level. Results mimic those obtained at the district level.
15 In the symmetric circle model of Salop (1979) with consumers being uniformly distributed, stations are equi-spaced around the circle in equilibrium.


G. Clemenz and K. Gugler

[Bar chart. Vertical axis: Frequency (0 to 800). Horizontal axis: Number of stations (1 to 10, >10).]

Fig. 3 Frequency distribution of the number of stations at the zip code level

4.3.2 Unobservable heterogeneity across sub-markets

A possible omitted variables bias would arise if more stations enter in exactly the markets with lower marginal costs, thus introducing a spurious negative correlation between station density and price, when in fact lower marginal costs are responsible for the findings. In Table 4 we effectively allowed marginal costs to vary across the nine federal states in Austria by including fixed federal state effects, which were highly significant. Although we additionally allowed for ALPS to influence marginal costs, it may be that marginal costs (or other unobservable factors affecting the pricing decision) vary across districts in ways we have not yet controlled for. For example, there may be some districts where road availability or quality is worse than in others even in the same federal state or given ALPS, and thus transport costs are higher. If we introduce 120 district dummies in addition to the constant term and estimate an equation like 4 in Panel A of Table 5 by 2SLS, the F-statistic on the fixed district effects is 8.30, indicating significance beyond the 1% level.16 The results on the other variables, however, remain unchanged. In particular, the coefficient on lnSz rises in absolute value to −0.031 (z=−3.21). This suggests that while there are unobservable differences across sub-markets affecting price, differential marginal costs at the district level are not responsible for our main findings.

4.3.3 Simultaneous determination of the entry and pricing decisions

Thus far we have assumed that station density is a predetermined variable with respect to price. Equilibrium price and density of stations, however, may be jointly

16 This essentially assumes that the error terms for average prices in different zip codes are correlated within districts, whereas the error terms for zip codes located in different districts are independent.

Locational choice and price competition


determined. Higher equilibrium prices, and therefore margins, should lure gasoline operators to enter the market, while higher station density should depress equilibrium prices. We have already presented 2SLS estimations; however, these do not explicitly take into account that the entry and pricing decisions may be taken simultaneously. As a final test of robustness, therefore, we test whether our results hold up if we estimate Eqs. 3 and 4 simultaneously by the full-information method 3SLS. All dependent variables are now explicitly endogenous to the system and as such are treated as correlated with the disturbances in the system's equations. Panel B in Table 5 presents the results. We present the results for the district level as the definition of the relevant region. Our results on both equations are not altered if we treat equilibrium margin and density of gasoline stations as jointly determined variables. While lnMk takes on the expected positive coefficient in the density equation, the coefficient is insignificant and does not alter the influence of the demand and concentration variables. The coefficient on lnSk remains negative and significant beyond the 1% level in the margin equation, even after controlling for the endogeneity of pricing and entry decisions, and the cross equation residual correlation.

4.3.4 Price dispersion

Although spatial competition models do not provide clear-cut explanations of price dispersion, it seems plausible that an increase in seller density, by increasing competition between stations, will decrease average price dispersion in a given market.
Several authors have analyzed this topic for the retail gasoline market.17 We obtain the following results for the 492 zip codes with at least two gasoline stations:

$$\mathrm{sd}(\ln P_{i,j,t,z})_{t,z} = \underset{(t\,=\,110.4)}{0.00852} \; - \underset{(t\,=\,-34.1)}{0.00138} \cdot \ln S_z + \varepsilon_{t,z}, \qquad R^2 = 0.3216, \quad \text{No. Obs.} = 74{,}292$$

where sd(ln Pi,j,t,z)t,z is the standard deviation of the logarithm of price of station i of firm j on day t in zip code z over all stations in z on day t. Thus, as might be expected, we find that price dispersion is significantly lower in zip codes with a larger station density. We additionally included in the above regression four firm dummies (F=41.10) and 120 district dummies (F=370.4). Thus, price dispersion differs significantly across oil companies and districts. It should be noted that these results are not sensitive to the choice of geographical unit.

5 Conclusions

We have shown that the Austrian retail gasoline market conforms quite well to the main predictions of spatial competition models. That is, the density of stations rises less than proportionally with population density, since fiercer competition drives

17 See Marvel (1976); Png and Reitman (1994); Adams (1997) and Barron et al. (2001).


price down. Equilibrium price and price dispersion are lower if competitors are nearer. Estimation as simultaneous equations confirms that causality runs from station density to price. We have also found that market concentration reduces the density of stations in a given region; however, we could not establish a consistent relationship between concentration and price. It appears that the main effects of concentration are on the entry decisions rather than on the pricing decisions. Our results suggest that spatial competition is an appropriate benchmark for judging the intensity (or lack thereof) of competition in the retail gasoline market. Thus, by explicitly recognizing the spatial dimension of markets, competition authorities can identify market conduct, and need not rely on market concentration–price studies with their attendant problems of reverse causality and endogeneity. It should be kept in mind, however, that competition in the retail gasoline market is not as simple as the basic model of spatial competition would have it. The price setting mechanism in reality may be quite intricate. In particular, prices are in general not set by individual gas stations. Stations can be owned and operated by the big companies directly, they can be owned and operated by independent dealers, and several combinations between these two extremes are possible. These refinements are certainly fruitful areas of future research.

Acknowledgement We wish to thank the seminar participants at the EARIE 2002 meeting in Madrid, at the University of Linz and at the University of Vienna for their helpful comments. In particular, we are grateful to L. H. Roeller and G. Götz. We also appreciate the comments of two anonymous referees as well as the efficient editorial handling of this paper.

Appendix

In order to illustrate Eqs. 1 and 2 in the main text, we use the model of spatial competition of Salop (1979). The following summary is borrowed from Chapter 6 of Anderson et al. (1992), where further details can be found. Assume that there is a continuum of consumers with measure N. They are uniformly distributed around a circle of circumference L, with density N/L. Each consumer buys one unit of the good at that shop where her total costs are smallest. Denote the location of consumer j as Zj, and the location of shop i as zi. The transport costs are given by

$$T_{ji} = \tau\,|Z_j - z_i|^{\beta} \qquad (5)$$

where ∣Zj−zi∣ is the length of the shortest arc linking Zj and zi on the circle, and τ and β are strictly positive parameters, with β ≥ 1. Now suppose there are n identical shops which are equi-spaced around the circle, hence the distance between two successive shops equals L/n. Finally, denote the marginal costs of each shop as c. It can be shown that in a symmetric equilibrium the price is given by

$$P^{*} = c + \tau\beta\,2^{1-\beta}\,(n/L)^{-\beta}. \qquad (6)$$

Note that n/L corresponds to S, the density of gasoline stations, in the general case discussed in Section 2. Obviously, Eq. 6 is a special case of Eq. 2. Denoting


the fixed entry costs as K, the equilibrium profit π* can be written as a function of the number of firms:

$$\pi^{*}(n) = N\,\tau\beta\,2^{1-\beta}\,L^{\beta}\,n^{-(\beta+1)} - K. \qquad (7)$$

In the complete model entry decisions take place in the first stage and price competition takes place in the second stage. It is assumed that relocation of shops is costless, and it can be shown that in equilibrium shops will be equi-spaced as has been assumed above. Entry takes place as long as Eq. 7 remains non-negative if an additional firm enters the market. The equilibrium number of firms per unit of distance is given by

$$n^{e}/L = \left(\frac{\tau\beta\,2^{1-\beta}\,N}{K\,L}\right)^{\frac{1}{1+\beta}}. \qquad (8)$$
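The circle model can be checked numerically. The following is a minimal sketch, assuming the transport cost takes the form τ·distance^β and treating the number of shops as a continuous variable; the function names and all parameter values are hypothetical and serve only as illustration.

```python
def salop_price(c, tau, beta, n, L):
    """Symmetric equilibrium price (Eq. 6): c + tau*beta*2^(1-beta)*(L/n)^beta."""
    return c + tau * beta * 2 ** (1 - beta) * (L / n) ** beta


def salop_profit(N, K, tau, beta, n, L):
    """Equilibrium profit per shop net of the entry cost K (Eq. 7)."""
    return N * tau * beta * 2 ** (1 - beta) * L ** beta * n ** (-(beta + 1)) - K


def free_entry_density(N, K, tau, beta, L):
    """Zero-profit number of shops per unit of distance (Eq. 8)."""
    return (tau * beta * 2 ** (1 - beta) * N / (K * L)) ** (1 / (1 + beta))


# Hypothetical parameter values (not taken from the paper).
c, tau, beta, L, N, K = 1.0, 0.5, 1.0, 10.0, 1000.0, 20.0

density = free_entry_density(N, K, tau, beta, L)  # shops per unit of distance
n_e = density * L                                 # free-entry number of shops
price_e = salop_price(c, tau, beta, n_e, L)       # price at the free-entry density
profit_e = salop_profit(N, K, tau, beta, n_e, L)  # zero up to rounding error
```

Under these assumptions a higher population density N/L raises the free-entry station density less than proportionally (the exponent is 1/(1+β) < 1), and a higher station density lowers the equilibrium price.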

Note again that N/L corresponds to the population density D in Eq. 1. Clearly, Eq. 8 can be considered as a special case of Eq. 1.

References

Adams AF (1997) Search costs and price dispersion in a localized, homogeneous product market: some empirical evidence. Review of Industrial Organization 12:801–808
Anderson RW, Johnson RN (1999) Antitrust and sales-below-cost laws: the case of retail gasoline. Review of Industrial Organization 14:189–204
Anderson SP, de Palma A, Thisse J-F (1992) Discrete choice theory of product differentiation. MIT, Cambridge, Massachusetts and London
Asplund M, Sandin R (1999) Competition in interrelated markets. Int J Ind Organ 17:353–369
Baltagi BH, Griffin JM (1997) Pooled estimators vs. their heterogeneous counterparts in the context of dynamic demand for gasoline. J Econom 77:303–327
Barron JM, Taylor BA, Umbeck JR (2001) Seller Density, and Price Dispersion: A Theoretical and Empirical Investigation, mimeo
Barros PB (1999) Multimarket competition in banking, with an example from the Portuguese market. Int J Ind Org 17:335–352
Beath J, Katsoulacos Y (1991) The economic theory of product differentiation. Cambridge University Press, Cambridge
Beckmann MJ, Thisse JF (1986) The Location of Production Activities. In: Nijkamp P (ed) Handbook of Regional and Urban Economics, vol I, Elsevier, Amsterdam
Borenstein S (1991) Selling costs and switching costs: explaining retail gasoline margins. RAND J Econ 22:354–369
Borenstein S, Shepard A (1996) Dynamic pricing in retail gasoline markets. RAND J Econ 27:429–451
Bresnahan TF, Reiss PC (1990) Entry in monopoly markets. Rev Econ Stud 57:531–553
Bresnahan TF, Reiss PC (1991) Entry and competition in concentrated markets. J Polit Econ 977–1009
Capozza DR, Van Order R (1980) Unique equilibria, pure profits, and efficiency in location models. Am Econ Rev 70:1046–1053
Castania R, Johnson H (1993) Gas wars: retail gasoline price fluctuation. Rev Econ Stat 75:171–174
Claycombe RJ, Mahan TH (1993) Spatial aspects of retail market structure: beef pricing revisited. Int J Ind Org 11:283–291
Considine TJ (2001) Markup pricing in petroleum refining: a multiproduct framework. Int J Ind Org 19:1499–1526
Eaton BC, Wooders MH (1985) Sophisticated entry in a model of spatial competition. RAND J Econ 16:282–297
Evans WN, Froeb LM, Werden GJ (1993) Endogeneity in the concentration–price relationship: causes, consequences, and cures. J Ind Econ XLI:431–438
Fik TJ (1988) Spatial competition and price reporting in retail food markets. Econ Geogr 64:29–44
Hotelling H (1929) Stability in competition. Econ J 39:41–57
Johnson RN, Romeo CJ (2000) The impact of self-service bans in the retail gasoline market. Rev Econ Stat 82:625–633
Martin S (1993) Advanced industrial economics. Blackwell, Cambridge, Massachusetts and Oxford
Marvel HP (1976) The economics of information and retail gasoline price behavior: an empirical analysis. J Polit Econ 84:1033–1060
Phlips L (1995) Competition policy: a game-theoretic perspective. Cambridge University Press
Pinkse J, Slade ME, Brett C (2001) Spatial price competition: a semiparametric approach. Econometrica (forthcoming)
Png IPL, Reitman D (1994) Service time competition. RAND J Econ 25:619–634
Puwein W, Wüger M (1999) Der Kraftstoffmarkt in Österreich. WIFO-Studie, Wien
Salop SC (1979) Monopolistic competition with outside goods. Bell J Econ 8:141–156
Schmalensee R, Stoker TM (1999) Household gasoline demand in the United States. Econometrica 67:645–662
Shepard A (1993) Contractual form, retail price, and asset characteristics in gasoline retailing. RAND J Econ 24:58–77
Slade ME (1986) Exogeneity tests of market boundaries applied to petroleum products. J Ind Econ 34:291–303
Slade ME (1987) Interfirm rivalry in a repeated game: an empirical test of tacit collusion. J Ind Econ 35:499–516
Slade ME (1992) Vancouver's gasoline-price wars: an empirical exercise in uncovering supergame strategies. Rev Econ Stud 59:257–276
Slade ME (1996) Multitask agency and contract choice: an empirical exploration. Int Econ Rev 37:465–486
Slade ME (1998) Strategic motives for vertical separation: evidence from retail gasoline markets. J Law Econ Organ 14:84–113
Tirole J (1988) The theory of industrial organization. MIT, Cambridge MA and London
von Weizsäcker CC (2002) Kollektive Marktbeherrschung im Rahmen der staatlichen und internationalen Fusionskontrolle. In: Franz W, Ramser HJ, Stadler M (eds) Fusionen, Mohr Siebeck, Tübingen
Weiss LW (ed) (1989) Concentration and Price. MIT, Cambridge, Massachusetts
White H (1980) A heteroscedasticity-consistent covariance matrix estimator and a direct test for heteroscedasticity. Econometrica 48:817–838

Dynamic spatial modelling of regional convergence processes

Reinhold Kosfeld · Jorgen Lauridsen

Abstract. Econometric analysis of convergence processes across countries or regions usually refers to a transition period between an arbitrarily chosen starting year and a fictitious steady state. Panel unit root tests and panel cointegration techniques have proved to be powerful econometric tools if the conditions are met. When referring to economically defined regions, though, it is the exception rather than the rule that coherent time series are available. For this case we introduce a dynamic spatial modelling approach which is suitable to trace regional adjustment processes in space instead of time. It is shown how the spatial error-correction mechanism (SEC model) can be estimated depending on the spatial stationarity properties of the variables under investigation. The dynamic spatial modelling approach presented in this paper is applied to the issue of conditional income and productivity convergence across labour market regions in unified Germany.

Key words: Regional convergence, dynamic spatial models, spatial unit roots, spatial error-correction

JEL classification: C21, R11, R15

1. Introduction

When analysing convergence processes of countries, time-series of the core variables of growth theory, namely production, income and employment, are available from publicly accessible data-bases. With some restrictions the same applies to indicators for control variables such as investment rate, human

We would like to thank an anonymous referee for his helpful comments.

R. Kosfeld (B) University of Kassel, Department of Economics, Nora-Platiel-Str. 5, 34127 Kassel, Germany (e-mail: [email protected]; www.wirtschaft.uni-kassel.de/Kosfeld) J. Lauridsen University of Southern Denmark, Department of Economics, Campusvej 55, 5230 Odense M, Denmark (e-mail: [email protected]; www.sam.sdu.dk/ansat/jtl)


R. Kosfeld, J. Lauridsen

capital, innovation and policy instruments. In this situation it seems advantageous to investigate adjustment processes of economic growth in a combined cross-section and time-series analysis by means of panel unit root tests and panel cointegration techniques. Convergence studies for panels of countries using this kind of econometric analysis were conducted, for example, by Evans and Karras (1996); Evans (1998); Holmes (2000); Kónya (2001). Although panel unit root tests can increase the degrees of freedom considerably, they by no means offer a "free lunch". In contrast to cross-sectional analysis, problems of structural stability can prove to be a serious obstacle. In addition, the researcher has to cope with the loss of uniqueness that goes along with the application of panel unit root tests.1 A serious disadvantage of most panel convergence studies is the insufficient modelling of cross-sectional dependence. In regional convergence studies a panel analysis of adjustment processes is often not feasible. Generally it is only at the state level that quarterly or yearly data on the relevant economic variables are available for a sufficiently long time period.2 When focussing on functional regions, production and income data are generally available only from structural surveys, which are carried out in Germany at intervals of at least two years. In our view the definition of functional regions is highly relevant in convergence analysis, since whether a spatial unit is to be regarded as rich or poor crucially depends on the assignment of the surroundings to a relevant regional centre (see e.g., Eckey et al. 1990, pp. 1). Apart from the long time interval between the surveys, regional data are usually subject to changes of nomenclature, which can restrict their comparability to a large extent. As far as convergence between West and new East Germany is concerned, in view of the sample size, analysis cannot even be performed at the state level in the time dimension.
The question arises whether it is at all possible to render regional adjustment processes transparent when panel analysis is not operational. The idea of tracing regional adjustment processes between two points in time only from spatial data is an outcome of new developments in spatial econometrics. They started off with a seminal paper by Fingleton (1999) in which he introduces the concepts of spatial cointegration and spatial error correction models. He shows that not only time trends but spatial trends, too, can lead to spurious regression with severe consequences for statistical inference. Lauridsen (2002) analyses the dynamics of adjustment to a global equilibrium based on a spatial autoregressively distributed lag model (local model). For model estimation, spatial properties of the involved variables have to be identified. This can be done by applying a powerful testing strategy recently proposed by Lauridsen and Kosfeld (2002). In this paper we aim to trace adjustment processes across functionally defined regions by means of dynamic spatial models. Section two outlines the growth theoretical basis consisting of an extended Solow model in which capital accumulation takes place not only in physical capital but in human

1 See also Vorbeek (2000, p. 334) who argues that "neither the null nor the alternative hypothesis" in panel unit root tests "is satisfied and it is unclear whether we would wish our test to reject or not".
2 Convergence studies for West German states on the basis of panel unit root tests are conducted by Bohl (1998) and Funke and Strulik (1999).


capital as well. In section three the global model and local models are developed for the implied growth relationship. It is shown that a spatial error-correction mechanism turns out to be a special representation of the dynamic spatial setting. Moreover, issues regarding model estimation and testing are addressed. Section four contains a description of the regional data set for investigating conditional income and productivity convergence in unified Germany. The empirical findings are discussed in section five. Section six concludes.

2. Growth theoretic basis

In empirical studies of growth, human capital provides a significant contribution to explanations of the variation of labour productivity even in a neoclassical modelling framework (see e.g., Mankiw et al. 1992; Seitz 1995; Islam 1995; Niebuhr 2001). Stressing the importance of human capital as an input factor, Lucas (1988) modelled the production function for human capital differently from that for other goods. Here we adopt the view of Mankiw et al. (1992, pp. 416) who suppose that both production functions are not fundamentally different (see also Romer 1996, pp. 126). The regional production functions in the augmented Solow model are of the Cobb-Douglas type:3

$$Y(t) = K(t)^{\alpha}\,H(t)^{\beta}\,[A(t)\,L(t)]^{1-\alpha-\beta}. \qquad (2.1)$$

Y, K, H, A, and L denote the level of output, physical capital, human capital, technology and labour input of a region considered at time t, respectively; A·L denotes regional labour input in efficiency units. The parameters α and β (0 < α < 1, 0 < β < 1) are the production elasticities of physical and human capital; 1 − α − β > 0 is the elasticity of labour input. In competitive markets the input factors are paid their marginal products. Labour L and level of technology A are assumed to grow exogenously at rates n and g. While technology growth g is supposed to be uniform in all regions of the economy, the growth rate of population, n, generally differs from region to region. To trace the evolution of production, physical and human capital in the economy we define the variables in labour efficiency units: $\hat{y} = Y/(A \cdot L)$, $\hat{k} = K/(A \cdot L)$ and $\hat{h} = H/(A \cdot L)$. With constant fractions of income invested in physical and human capital, $s_k$ and $s_h$, a regional economy evolves according to the differential equations4

$$\dot{\hat{k}}(t) = s_k\,\hat{y}(t) - (n + g + \delta)\,\hat{k}(t) \qquad (2.2)$$

and

$$\dot{\hat{h}}(t) = s_h\,\hat{y}(t) - (n + g + \delta)\,\hat{h}(t), \qquad (2.3)$$

3 It is assumed that (2.1) underlies the production of consumption goods, physical capital and human capital. The goods can be transformed costlessly into each of these uses.
4 A dot above a variable denotes its derivative with respect to time: $\dot{x} = dx/dt$.

where δ denotes the uniform depreciation rate of physical and human capital. If there are decreasing returns to "aggregate" capital (α + β < 1), a region converges to its steady state

$$\hat{k}^{*} = \left(\frac{s_k^{1-\beta}\,s_h^{\beta}}{n + g + \delta}\right)^{1/(1-\alpha-\beta)} \qquad (2.4)$$

and

$$\hat{h}^{*} = \left(\frac{s_k^{\alpha}\,s_h^{1-\alpha}}{n + g + \delta}\right)^{1/(1-\alpha-\beta)} \qquad (2.5)$$

where labour productivity y = Y/L is given by

$$y^{*} = A(0)\,e^{gt}\left(\frac{s_k^{\alpha}\,s_h^{\beta}}{(n + g + \delta)^{\alpha+\beta}}\right)^{1/(1-\alpha-\beta)}. \qquad (2.6)$$
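The convergence to the steady state can be illustrated numerically. The sketch below assumes the Cobb-Douglas technology in efficiency units, so that output per efficiency unit equals k^α·h^β, and integrates the two differential equations with a simple Euler scheme; the function names, step size and all parameter values are hypothetical choices made for illustration.

```python
def steady_state(s_k, s_h, n, g, delta, alpha, beta):
    """Closed-form steady state of physical and human capital per efficiency unit."""
    x = n + g + delta
    e = 1.0 / (1.0 - alpha - beta)
    k_star = (s_k ** (1 - beta) * s_h ** beta / x) ** e
    h_star = (s_k ** alpha * s_h ** (1 - alpha) / x) ** e
    return k_star, h_star


def simulate(k0, h0, s_k, s_h, n, g, delta, alpha, beta, dt=0.05, steps=20000):
    """Euler integration of the accumulation equations for k-hat and h-hat."""
    k, h = k0, h0
    for _ in range(steps):
        y = k ** alpha * h ** beta             # output per efficiency unit
        dk = s_k * y - (n + g + delta) * k     # change in physical capital
        dh = s_h * y - (n + g + delta) * h     # change in human capital
        k, h = k + dt * dk, h + dt * dh
    return k, h


# Hypothetical parameter values (not estimates from the paper).
params = dict(s_k=0.2, s_h=0.1, n=0.01, g=0.02, delta=0.05,
              alpha=1 / 3, beta=1 / 3)

k_star, h_star = steady_state(**params)
k_T, h_T = simulate(1.0, 1.0, **params)   # path started far below the steady state
```

With these values the simulated path approaches the closed-form steady state, which is the sense in which a region with given saving rates and population growth converges conditionally.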

Since the parameters n, g and δ as well as the quantities $s_k$ and $s_h$ can differ from region to region, in general only conditional convergence applies. Unconditional convergence would presuppose a catching-up by poorer regions without a need to control for region-specific differences. Mankiw, Romer and Weil (1992, pp. 410) consider the log of A(0) to be composed of a constant c which is common to all cross-sectional units and a country-specific shock u:

$$\ln A(0) = c + u. \qquad (2.7)$$

In regional analysis u can be viewed to include different levels of technology, different regional inefficiencies (Schalk et al. 1995, pp. 26), a different composition of produced goods and other region-specific characteristics. As a region-specific shock, u ultimately captures all random variation in regional labour productivity y. Using the composition (2.7), the equilibrium relationship (2.6) has the log-linearized form

$$\ln y = d - \frac{\alpha+\beta}{1-\alpha-\beta}\,\ln(n + g + \delta) + \frac{\alpha}{1-\alpha-\beta}\,\ln s_k + \frac{\beta}{1-\alpha-\beta}\,\ln s_h + u \qquad (2.8)$$

with $y = y^{*}$ and $d = c + g \cdot t$.5 According to (2.8), in the steady state, regional labour productivity is determined by population growth, technology growth, depreciation of capital and physical and human capital accumulation. With regard to the region-specific variables we can establish a negative dependence of labour productivity on population growth and a positive dependence on both kinds of capital accumulation.

3. Modelling spatial processes

3.1. Spatially integrated processes and stationarity

In order to analyse local adjustment processes we have to introduce the concepts of spatial stationarity and spatial cointegration. We start with the

5 For a cross-section regression the time index t is fixed. Hence, the term g·t is a constant that can be added to the common shock c to give the intercept of the equilibrium relationship (2.8).


first-order autoregressive process as a spatial data generating process for a variable y, which is given in matrix notation by

$$y = \rho\,W y + \varepsilon \qquad (3.1)$$

with $y = (y_1, y_2, \ldots, y_n)'$. In our cross-sectional analysis the components of y refer to the n regions of an economy. The disturbance vector $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$ is assumed to follow a normal distribution with an expectation vector of zero and a scalar covariance matrix:

$$\varepsilon \sim N(0, \sigma^2 I). \qquad (3.2)$$

ρ denotes an autoregressive parameter and σ² the variance of the disturbances $\varepsilon_i$. W defines an n×n contiguity matrix with non-zero entries for spatially contiguous regions. Let W* be an n×n neighbourhood matrix whose entries $W^{*}_{ij}$ take only the values 1 and 0:

$$W^{*}_{ij} = \begin{cases} 1 & \text{if regions } i \text{ and } j \text{ are neighbours} \\ 0 & \text{otherwise.} \end{cases}$$

The entries of W result from a row normalisation of W*, which is achieved by dividing the elements of the ith row of W* by the ith row sum $\sum_j W^{*}_{ij}$. Thus the ith element of the n×1 vector W·y is the mean of the variable y in the neighbourhood regions of i.

In spatial econometrics one must be cautious when wishing to interpret the autoregressive parameter ρ as an autocorrelation coefficient as in time series analysis. Generally, in maximum likelihood estimation, the likelihood function ensures that the autoregressive parameter lies in the interval $[1/\omega_{\min}, 1/\omega_{\max}]$, where the bounds are the reciprocals of the minimum and maximum eigenvalues $\omega_{\min}$ and $\omega_{\max}$ of the weight matrix. For the row-normalised matrix W, $\omega_{\max} = 1$ and hence ρ ≤ 1 is ensured, but not $\omega_{\min} = -1$ (Anselin 1982). However, using instrumental variables, there is no guarantee that the estimate will fall within this interval, and this may lead us to uncertain areas of interpretation and inference, for example associated with the existence of spatial unit roots. From simulation studies (e.g., Kelejian and Robinson 1995) the range of the autoregressive parameter ρ appears to be considerably narrower when spatial analysis is conducted on the basis of an unstandardised weight matrix W*. In accordance with Fingleton (1999) we adopt |ρ| < 1 for the data generating process to be stationary, although the validity of this inequality is not a sufficient condition for it. However, asymptotically stationarity is ensured for |ρ| < 1. Taking this restriction into account, it is straightforward to call y, generated by equation (3.1), a spatially integrated process of order one [SI(1)] when ρ = 1; for |ρ| < 1 y is called spatially stationary [SI(0)]. An SI(1) variable y is said to have a unit root. It has to be spatially differenced once,

$$\Delta y = y - W y = (I - W)\,y,$$

to become stationary. In general, a spatially integrated process of order d, SI(d), has d unit roots. It becomes stationary after applying the spatial difference operator Δ = I − W d times:

$$\Delta^{d} y = (I - W)^{d}\,y.$$
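The row normalisation, the spatial lag and the spatial difference operator can be sketched in a few lines. The example assumes a hypothetical economy of four regions arranged in a line (region 1 borders region 2, region 2 borders regions 1 and 3, and so on); the function names and the data are illustrative only.

```python
def row_normalise(w_star):
    """Divide each row of the 0/1 neighbourhood matrix W* by its row sum."""
    w = []
    for row in w_star:
        s = sum(row)
        w.append([v / s if s else 0.0 for v in row])
    return w


def spatial_lag(w, y):
    """W y: for each region, the mean of y over its neighbouring regions."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


def spatial_difference(w, y, d=1):
    """Apply the spatial difference operator (I - W) d times."""
    for _ in range(d):
        wy = spatial_lag(w, y)
        y = [yi - li for yi, li in zip(y, wy)]
    return y


# Hypothetical neighbourhood matrix W* for four regions in a line: 1-2-3-4.
w_star = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
w = row_normalise(w_star)

y = [10.0, 12.0, 16.0, 20.0]       # illustrative regional values
wy = spatial_lag(w, y)             # neighbourhood means
dy = spatial_difference(w, y)      # y minus its neighbourhood mean
```

Because every row of W sums to one, a variable that is constant across regions has a spatial difference of zero, which mirrors the role of first differences in time series.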


3.2. Spatial cointegration and spatial dynamics

Let x and y both be SI(1) variables. Then in general any linear combination of x and y is also SI(1). If, however, a linear combination y − βx exists which is stationary, x and y are said to be spatially cointegrated. In this case the cointegrating vector is given by (1 −β). More generally, let x and y both be SI(d) variables. For a linear combination y − βx of lower order of spatial integration than d, say SI(d−b) with 0
(3.3)

when using the above defined variables. y, x1, x2 and x3 are n×1 vectors whose components are regionally determined, i is the n×1 unit vector and u an n×1 vector of disturbances, u ~ N(0, Ω), where Ω denotes an n×n covariance matrix. The parameters β0, β1, β2 and β3 measure the effects of the exogenous variables 1, x1, x2 and x3 on labour productivity y. As is well-known from time-series analysis, in our multiple variable case the existence of a cointegrating vector (1 −β0 −β1 −β2 −β3) does not necessarily require the variables y, x1, x2 and x3 to be of the same order of integration (Charemza and Deadman 1992, pp. 147). In case of conditional convergence, regional adjustment towards the global equilibrium will arise. In contrast, when convergence is missing, local discrepancies will tend to persist. Thus we have to investigate what kind of spatial dynamics is driving the economy. In time series econometrics adjustment processes are evaluated by means of an error-correction model (ECM). The analogous local model construction to capture regional dynamics is called the spatial error-correction model (SEC model) (Fingleton 1999; Lauridsen 2002). Local developments can be imagined to spread out at first in the neighbourhood regions before diffusing over the whole economy. Indeed, observed spatial correlations seem to confirm a marked spatial dimension of regional adjustment processes (see e.g., Kosfeld et al. 2002). They may probably be attributed to rigidity barriers such as substantial costs which prevent economic agents from adjusting instantaneously to new information. Not only the spatially lagged values of the dependent variable but also those of the exogenous variables generally have to be taken into account during the transition periods.
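In the simplest case where only spatial lags of first order are allowed for, a local model can be established in the form

$$\begin{aligned} y &= \alpha_0 i + \alpha_1 W y + \beta_{10} x_1 + \beta_{11} W x_1 + \beta_{20} x_2 + \beta_{21} W x_2 + \beta_{30} x_3 + \beta_{31} W x_3 + \nu \\ &= \alpha_0 i + \alpha_1 W y + \sum_{i=1,2,3} (\beta_{i0} x_i + \beta_{i1} W x_i) + \nu \end{aligned} \qquad (3.4)$$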


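with ν as an n×1 disturbance vector: $\nu \sim N(0, \sigma_{\nu}^{2} I)$. A local model of the form (3.4) is termed a spatial autoregressively distributed lag model (SADL model) (see Lauridsen, 2002). It can easily be transformed to obtain a link to the global model (3.3):

$$y = \kappa_0 i + \kappa_1 x_1 + \kappa_2 x_2 + \kappa_3 x_3 + \theta_0 \Delta y + \theta_1 \Delta x_1 + \theta_2 \Delta x_2 + \theta_3 \Delta x_3 + \frac{1}{1-\alpha_1}\,\nu \qquad (3.5)$$

with

$$\kappa_0 = \frac{\alpha_0}{1-\alpha_1}, \quad \kappa_1 = \frac{\beta_{10}+\beta_{11}}{1-\alpha_1}, \quad \kappa_2 = \frac{\beta_{20}+\beta_{21}}{1-\alpha_1}, \quad \kappa_3 = \frac{\beta_{30}+\beta_{31}}{1-\alpha_1},$$

$$\theta_0 = -\frac{\alpha_1}{1-\alpha_1}, \quad \theta_1 = -\frac{\beta_{11}}{1-\alpha_1}, \quad \theta_2 = -\frac{\beta_{21}}{1-\alpha_1} \quad \text{and} \quad \theta_3 = -\frac{\beta_{31}}{1-\alpha_1}.$$

An adjustment to the global model (3.3) can only arise if the spatial lag coefficient α1 lies in the interval (0, 1), since regional discrepancies would persist otherwise. In case of convergence, spatial differences in the variables decrease more and more during the adjustment process. This means that ultimately the local model (3.5) degenerates to the global model (3.3).

3.3. Spatial error-correction

Some easy manipulations of (3.4) provide the equivalent representation

$$\Delta y = \alpha_0 i + (\alpha_1 - 1) W y + \sum_{i=1,2,3} \left(\beta_{i0} \Delta x_i + (\beta_{i0} + \beta_{i1}) W x_i\right) + \varepsilon, \qquad (3.6)$$

where Δ = (I − W). Further manipulations result in

$$\Delta y = \alpha_0 i + (\alpha_1 - 1)\Big(W y - \sum_{i=1,2,3} W x_i\Big) + \sum_{i=1,2,3} \beta_{i0} \Delta x_i + \sum_{i=1,2,3} (\beta_{i0} + \beta_{i1} + \alpha_1 - 1) W x_i + \varepsilon. \qquad (3.7a)$$

Alternatively, (3.7a) can be rewritten as

$$\Delta y = \alpha_0 i + (\alpha_1 - 1)\Big(W y - \sum_{i=1,2,3} \kappa_i W x_i\Big) + \sum_{i=1,2,3} \beta_{i0} \Delta x_i + \varepsilon. \qquad (3.7b)$$

A final set of manipulations provides

$$y = \kappa_0 i + \theta_0 \Delta y + \sum_{i=1,2,3} (\kappa_i x_i + \theta_i \Delta x_i) + \frac{1}{1-\alpha_1}\,\varepsilon. \qquad (3.8)$$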
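The algebraic equivalence of the local model and its error-correction and Bewley-type rewritings can be verified numerically. The sketch below is an illustration under stated assumptions: it uses a single regressor x instead of three, a hypothetical row-normalised weight matrix for three regions, arbitrary coefficient values, and the definitions κ1 = (β10 + β11)/(1 − α1), θ0 = −α1/(1 − α1) and θ1 = −β11/(1 − α1). For arbitrary data, each form must leave the same residual, up to the scaling factor 1/(1 − α1) in the Bewley-type form.

```python
def lag(w, v):
    """Spatial lag W v for a row-normalised weight matrix w."""
    return [sum(a * b for a, b in zip(row, v)) for row in w]


# Hypothetical row-normalised weights for three regions in a line.
w = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
y = [2.0, 3.5, 5.0]                     # arbitrary illustrative data
x = [1.0, 4.0, 2.0]
a0, a1, b0, b1 = 0.3, 0.6, 0.8, -0.2    # stand-ins for alpha_0, alpha_1, beta_10, beta_11

wy, wx = lag(w, y), lag(w, x)
dy = [yi - li for yi, li in zip(y, wy)]     # spatial difference of y
dx = [xi - li for xi, li in zip(x, wx)]     # spatial difference of x

kappa0 = a0 / (1 - a1)
kappa1 = (b0 + b1) / (1 - a1)
theta0 = -a1 / (1 - a1)
theta1 = -b1 / (1 - a1)

idx = range(len(y))
# Residual implied by the SADL form.
res_sadl = [y[i] - (a0 + a1 * wy[i] + b0 * x[i] + b1 * wx[i]) for i in idx]
# Residual implied by the error-correction form with the kappa-weighted discrepancy.
res_sec = [dy[i] - (a0 + (a1 - 1) * (wy[i] - kappa1 * wx[i]) + b0 * dx[i]) for i in idx]
# Residual implied by the Bewley-type form (scaled by 1 / (1 - a1)).
res_sbe = [y[i] - (kappa0 + kappa1 * x[i] + theta0 * dy[i] + theta1 * dx[i]) for i in idx]

same_sec = all(abs(r - s) < 1e-12 for r, s in zip(res_sadl, res_sec))
same_sbe = all(abs(r - (1 - a1) * s) < 1e-12 for r, s in zip(res_sadl, res_sbe))
```

Both flags evaluate to true for any choice of data and coefficients with α1 ≠ 1, so the forms differ only in how the same information is arranged and interpreted.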

The forms (3.6), (3.7) and (3.8) are algebraically equivalent to (3.4), but provide different interpretations. Equation (3.6) is a spatial generalization of the time series Baardsen specification, which we will denote the SBA model. Models (3.7a) and (3.7b) generalize the Error Correction (EC) model and will be denoted as the SEC model. Finally, (3.8) is a generalization of the Bewley transform, which we will call the SBE model. In contrast to the SADL, the SBA and the SEC describe the formation of expected local differences in y as depending on local differences in x and locally lagged values of x. They differ in that the SBA introduces locally lagged levels of y, whereas the SEC introduces the locally lagged discrepancy between y and x. Thus, in the


SEC, the term (a1-1) represents the local adjustment to any discrepancy. The SBE is especially interesting as it incorporates the global multipliers directly with j0 as the constant and j1 as the coeﬃcients for x. If the spatially lagged variables Wy, Wx1, Wx2 and Wx3 are spatially nonstationary, P this property transfers immediately to the error-correction term (Wy- i¼1;2;3 Wxi Þ in (3.7a). Here spurious regression prevents a meaningful estimation of the SEC form (3.7a). In contrary, for spatially stationary lagged variables the SEC model (3.7a) provides a straightforward estimation equation of the error-correction mechanism. If the spatially lagged variables turn out to be spatially nonstationary, the SEC form (3.7b) may be the focus of interest. This is the case for the spatially lagged variables Wy, Wx1, Wx2 and Wx3 being spatially cointegrated which ensures the existence P of a spatially stationary linear combination (Wy- i¼1;2;3 ji Wxi Þ. None of the speciﬁcations (3.4)–(3.8) can be estimated using OLS. This is due to the presence of contemporaneous y values in the variable Wy emerging in some form or another as an explanatory variable, implying correlation between Wy and e. For the case of the SAR, this is proved in details in Anselin (1988a, pp. 57), whereas Fingleton (1999) provides the proof for the SEC model. Their arguments are directly carried over to the SADL, SBA and SBE models. Due to the aforementioned correlation, asymptotically justiﬁed methodologies must be applied. The IV estimation is based on the idea of ﬁnding a variable z which is uncorrelated with e, but correlated with Wy (or whatever form in which y appears on the right-hand side of (3.4)–(3.8) and using this as an instrument variable in a one-step least squares estimation. Formally, if we want to estimate the SADL in (3.4), we deﬁne X0 = [x1 x2 x3], X ¼ [i Wy X0 WX0] and Z ¼ [i z X0 WX0], where i is an n · 1 vector of 1’s. 
Defining $\gamma_{SADL} = (\alpha_0 \; \alpha_1 \; \beta_0 \; \beta_1)'$ and inserting the projections of the columns of X in the column space of Z (i.e. $\hat{X} = P_Z X$, where $P_Z = Z(Z'Z)^{-1}Z'$, see Greene 2002, p. 78), the IV estimator is

$$\hat{\gamma}_{SADL} = (X' P_Z X)^{-1} X' P_Z y.$$

The covariance matrix is given by

$$V_{SADL} = \sigma^2 (X' P_Z X)^{-1},$$

with $\sigma^2$ estimated consistently by

$$s^2 = (y - X\hat{\gamma}_{SADL})'(y - X\hat{\gamma}_{SADL})/n.$$

As a choice for z, Anselin (1988a, p. 85) suggests the lagged value of the prediction of y from an OLS regression on those variables in X not correlated with $\varepsilon$, i.e. x and Wx. Denoting the predicted y by $\hat{y}$, the instrument variable is defined as $W\hat{y}$, and the IV estimator is obtained by setting $Z = [i \; W\hat{y} \; X_0 \; WX_0]$. Using $\hat{y}$ as an instrument for any occurrence of y on the right-hand sides, IV estimation of the alternative forms (3.4)–(3.8) is easily provided. The choices of X, Z, and dependent variable for (3.4), (3.6), (3.7), and (3.8) are summarized in Table 1. At first sight, an obvious and tempting generalization of the IV approach seems to be the inclusion of further spatial lags of $\hat{y}$, i.e. $W^2\hat{y}, W^3\hat{y}, \ldots$ However, as pointed out by Fingleton (2000, 2001), this may lead to a risk of linear dependence among the columns of Z (see also Kelejian and Robinson 1993; Kelejian and Prucha 1998).
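The two-step IV procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' code: the function names `row_standardize` and `iv_sadl` are hypothetical, W is assumed to be a row-standardized n × n matrix, and the parameter ordering (constant, Wy, X0, WX0) is chosen here for convenience.

```python
import numpy as np

def row_standardize(w_star):
    """Scale each row of a neighbourhood matrix so that it sums to one."""
    return w_star / w_star.sum(axis=1, keepdims=True)

def iv_sadl(y, X0, W):
    """IV estimate of the SADL form, instrumenting W y by W y_hat.

    Hypothetical helper: returns (gamma_hat, V_hat) with parameters
    ordered as [constant, W y, X0 columns, W X0 columns].
    """
    n = y.shape[0]
    ones = np.ones((n, 1))
    WX0 = W @ X0
    # Regressors assumed uncorrelated with the error term: i, X0 and W X0
    exog = np.hstack([ones, X0, WX0])
    # Step 1: OLS prediction of y from the exogenous regressors only
    y_hat = exog @ np.linalg.lstsq(exog, y, rcond=None)[0]
    # Regressor matrix X and instrument matrix Z as defined in the text
    X = np.hstack([ones, (W @ y)[:, None], X0, WX0])
    Z = np.hstack([ones, (W @ y_hat)[:, None], X0, WX0])
    # Step 2: P_Z X is obtained by regressing the columns of X on Z
    PZX = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    XtPX = PZX.T @ X                      # equals X' P_Z X
    gamma = np.linalg.solve(XtPX, PZX.T @ y)
    resid = y - X @ gamma
    s2 = resid @ resid / n                # consistent estimate of sigma^2
    cov = s2 * np.linalg.inv(XtPX)
    return gamma, cov
```

With simulated data, `iv_sadl(y, X0, W)` returns the coefficient vector and its estimated covariance matrix; square roots of the diagonal of the latter give standard errors. Computing $P_Z X$ through a least squares solve rather than forming $Z(Z'Z)^{-1}Z'$ explicitly is a numerical design choice of this sketch.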

Table 1. Choices of X, Z and dependent variable

| Model | X | Z | Dep. var. |
|---|---|---|---|
| SADL | [i  Wy  X0  WX0] | [i  Wŷ  X0  WX0] | y |
| SBA | [i  Wy  ΔX0  WX0] | [i  Wŷ  ΔX0  WX0] | Δy |
| SEC | [i  (Wy − Σj Wxj)  X0  WX0] | [i  (Wŷ − Σj Wxj)  ΔX0  WX0] | Δy |
| SBE | [i  Δy  X0  ΔX0] | [i  Δŷ  X0  ΔX0] | y |

As one possible further complication, the error terms for the single regions may be spatially autocorrelated. A recent Cochrane-Orcutt type generalization of the IV method allows one to adjust for this (see Kelejian and Prucha 1998). For the sake of simplicity and focus, we refrain from incorporating this adjustment in the present investigation. Using the one-to-one correspondence between the parameters of the four models, IV estimators for $\gamma_{SADL}$ may be derived from any of the four models upon IV estimation of the chosen one, just as $V_{SADL}$ is easily derived using, for example, the delta method (Greene 2002). Of course, this is equivalent to a separate IV estimation of all four models, which is easier in practice. In the present study, separate IV estimations were used. This may lead to minor rounding-off errors in reported parameters.

3.4. Testing for spatial unit roots

In order to know how to estimate the equilibrium relationship of the augmented Solow model (Eq. (2.8)), we need to know the degree of integration of the variables involved. If the variables are nonstationary but integrated of the same degree, a test of cointegration is straightforward. For different degrees of integration, cointegration is only possible if special conditions are met. The present study suggests a strategy based on a twofold application of a Lagrange Multiplier test for spatially autocorrelated errors.6 The LM error statistic (LME) developed in Anselin (1988a, 1988b),

$$LME = \frac{(e'We/\sigma^2)^2}{\mathrm{tr}(W^2 + W'W)}, \qquad (3.9)$$

is asymptotically $\chi^2$ distributed with one degree of freedom under $H_0: \lambda = 0$. In the case of spurious regression, the error term of the regression

$$y = X\beta + \varepsilon \qquad (3.10)$$

will contain a unit root, i.e. $\varepsilon = \lambda W\varepsilon + \mu$, $\mu \sim N(0, \sigma^2 I)$, with $\lambda = 1$. Therefore, a large LME value indicates either spatial nonstationarity or stationary (positive or negative) autocorrelation. This result corresponds to the suggestions of Fingleton (1999) with the Moran I test replacing the LM test. Next, under H0: nonstationarity, it follows that

$$\varepsilon = W\varepsilon + \mu \iff \varepsilon = \Delta^{-1}\mu, \quad \text{so that} \quad \Delta y = \Delta X\beta + \mu, \qquad (3.11)$$

with $\Delta = I - W$ as the spatial difference operator. Equation (3.11) implies that a regression of $\Delta y$ on $\Delta X$ provides a white noise error, so that an LM error test statistic for this spatially differenced model (DLME) will be close to zero. On the other hand, if H0: nonstationarity does not hold, then the spatial differencing will bring about a negative (stationary) spatial residual autocorrelation, leading to a positive DLME value. In conclusion, the test strategy consists of calculating and inspecting the LME and the DLME values, leading to one of three conclusions:7 nonstationary, spurious regression (LME positive, DLME zero); stationary spatial autocorrelation (LME and DLME positive); or absence of autocorrelation (LME zero, DLME positive).

It may be further relevant to investigate whether y or any of the x variables are spatially nonstationary. This may be revealed by using the suggested procedure for a regression of the variable in question (i.e. z being one of $y, x_1, x_2, \ldots$) on a constant term. Specifically, the regressions

$$z = \alpha i + \varepsilon \qquad (3.12)$$

and

$$\Delta z = \alpha \Delta i + \varepsilon \qquad (3.13)$$

readily provide the LME and DLME test statistics, which lead to one of three conclusions: z is spatially nonstationary (LME positive, DLME zero), z represents a stationary SAR scheme (LME positive, DLME positive), or z is free of any spatial pattern (LME zero, DLME positive).

6 In a Monte Carlo study Lauridsen and Kosfeld (2002) have shown the finite sample properties of the suggested test strategy to be satisfactory.
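The twofold LM test can be sketched as follows. This is a minimal illustration, assuming a dense NumPy weights matrix W and OLS residuals; the function names `lm_error`, `lme_dlme` and `chi2_1_pvalue` are hypothetical and not taken from the paper. The p-value uses the fact that a chi-square variate with one degree of freedom is the square of a standard normal.

```python
import math
import numpy as np

def lm_error(y, X, W):
    """LM error statistic of (3.9), computed from OLS residuals of y on X."""
    n = y.shape[0]
    # lstsq also copes with a rank-deficient X, e.g. a differenced
    # constant that is identically zero under row standardization
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ coef
    sigma2 = e @ e / n
    trace_term = np.trace(W @ W + W.T @ W)
    return (e @ W @ e / sigma2) ** 2 / trace_term

def lme_dlme(y, X, W):
    """Return (LME, DLME) for the original and the spatially differenced model."""
    D = np.eye(W.shape[0]) - W          # spatial difference operator I - W
    return lm_error(y, X, W), lm_error(D @ y, D @ X, W)

def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-square(1) variate."""
    return math.erfc(math.sqrt(stat / 2.0))
```

For a single variable z, calling `lme_dlme(z, np.ones((n, 1)), W)` corresponds to the regressions (3.12) and (3.13); comparing both statistics with the chi-square(1) distribution then gives the "positive" or "zero" verdicts used in the test strategy.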

4. Data

The study of regional convergence in unified Germany refers to the state of development in 2000, i.e. about a decade after unification. Although official statistics provide data for disaggregated administrative areal units, our notion of a region is economic in nature. Making no allowance for economic relationships in space is expected to result in distortions regarding economic conditions and development (see Eckey et al. 1990). For this reason Eckey (2001) has defined German functional regions by aggregating districts (Kreise) on the basis of commuter flows. The functional regions arising in this way are called ‘regional labour markets’. Starting from 440 German districts, Eckey (2001) constructed 180 German labour markets, of which 133 are mainly located in West Germany and 47 in East Germany.8

7 The test result is termed “positive” if the LM test statistic differs significantly from zero and “zero” otherwise.

8 There are three overlapping regions which consist of a majority of West German districts. Therefore they are labelled as West German regions.

Since growth theory takes full employment for granted, the convergence relationship can be applied to both income per capita and labour productivity.9 Both indicators are calculated in real terms, where district data on gross domestic product (GDP), employment and population have been aggregated and state data on the GDP price index have been disaggregated to match the regional labour markets concept. The data stem from the “National Accounts of the States” (“Volkswirtschaftliche Gesamtrechnung der Länder”) compiled by the Statistical State Office Baden-Württemberg. In the augmented Solow model the sum of population growth, capital depreciation and growth of technological progress enters as an exogenous variable. Mankiw et al. (1992, p. 413) and Islam (1995, p. 1139), for example, view the last two components as constant in their country samples and set their sum equal to 0.05 in order to “match the available data”.10 Since regionally differentiated depreciation rates are not available for unified Germany either, we have calculated a uniform average depreciation rate of 4.8% from data on depreciation and invested capital, which proves to be very stable in the nineties (Statistisches Bundesamt 1999, 2001). The choice of the rate of technological progress is based on an empirical study by Grömling (2001), who estimated a value of 0.6% for unified Germany in the period 1992–1999. Investment rates for the overall regional economies as measures of regional savings rates sk are not available at the disaggregation level required. Regional investment rates are only available for the industrial sector. Because the industrial sector no longer represents even the largest sector of the economy, there is a danger that distortions may produce uncontrolled effects when working with such a restricted indicator. That is why we prefer to measure regional investment intensity by newly established enterprises per capita.
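As a small worked illustration of the exogenous term described above (the log of depreciation rate plus rate of technological progress plus population growth), the following sketch uses the 4.8% and 0.6% rates quoted in the text. The function name `ldtw` and the example growth rate of zero are assumptions made here for illustration only.

```python
import math

def ldtw(pop_growth, depreciation=0.048, tech_progress=0.006):
    """Log of (depreciation + technological progress + population growth).

    Default rates are the uniform values quoted in the text; pop_growth
    is a region's population growth rate expressed as a fraction.
    """
    return math.log(depreciation + tech_progress + pop_growth)

# Hypothetical region with a stagnant population
value = ldtw(0.0)
```

For a region with zero population growth the term is the log of 0.054, roughly -2.92, so regional variation in this variable stems entirely from differences in population growth.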
Regional data on newly established businesses are available for 2000 on the CD “Statistik regional”, which is offered by the Federal Statistical Office Germany. Since investment in human capital is much more difficult to measure than investment in physical capital, we substitute sh in convergence Eq. (2.8) by an indicator of the level of human capital.11 Human capital is in general viewed as labour qualifications acquired in education and training. In West German regional growth studies the proportion of the working population with a university degree or a degree from an advanced technical college is used as an indicator for human capital (see Seitz 1995; Niebuhr 2001). Due to data accessibility it is usual to refer only to that part of the population bound by law to the social security system. Besides the self-employed, the most notable other omissions are all officials and civil servants. To reduce distortion effects as far as possible we construct a comprehensive human capital indicator which comprises officials and civil servants. The two highest career groups of civil servants are well matched with the degrees of the employees being

9 Formally the equality of both concepts is established by normalising the labour participation rate to 1. In applied work a differentiation between the two concepts is necessary.

10 Mankiw et al. (1992), p. 413. In both studies the depreciation rate is set equal to 0.03, whereas for the rate of technological progress a value of 0.02 is chosen.

11 Formally, if ln(sh) is substituted by the log level variable H, equation (2.8) changes insofar as the production elasticity of human capital, β, now only appears in the numerator of the coefficient of ln(H). See Mankiw et al. (1992), p. 418.

Table 2. Variables used in the empirical study

| Variable | Definition | Mean | S.D. | Min | Max |
|---|---|---|---|---|---|
| LGDPER | Log gross domestic product per total employment 2000 | 10.72 | 0.16 | 10.29 | 11.12 |
| LGDPCR | Log gross domestic product per capita 2000 | 20.94 | 4.88 | 12.07 | 40.32 |
| EAST | East-West Dummy | 0.26 | 0.44 | 0 | 1 |
| LDTW | Log (depreciation rate + rate of technical progress + growth rate of population) (Averages resp. representative values for 90ties) | -2.89 | 0.12 | -3.17 | -2.60 |
| LHUMAN | Log proportion of highly educated people per total employment 2000 (Secondary school + technical college + university degree) | 2.55 | 0.28 | 1.98 | 3.41 |
| LNBF | Log newly founded business per 1000 inhabitants 2000 | 1.90 | 0.17 | 1.51 | 2.34 |
| W* (proximity matrix) | Neighbourhood matrix for N=180 German labour markets (b): number of links per labour market | 5.22 | 1.90 | 1 | 12 |
| W | Row standardization of W* | | | | |

Density of W* = .029

Data constructed for N = 180 German labour markets from district and state data. Source: (a) Volkswirtschaftliche Gesamtrechnung der Länder (Statistical State Office Baden-Württemberg); Statistik regional, Statistisches Jahrbuch (Federal Statistical Office Germany); German statistical state offices; Own construction. (b) University of Kassel, Department of Economics (see Eckey 2001).

bound to the social security system. Disaggregated data on the qualifications and careers of the working population in 2000 have been provided by the German Federal Statistical Office and the German statistical state offices.

5. Empirical evidence on German regional convergence

We investigate the conditional convergence hypothesis with respect to income per capita and labour productivity within the dynamic spatial setting provided by the spatial autoregressively distributed lag model (SADL model). With human capital, one potential growth-relevant factor neglected in the neoclassical Solow model is additionally taken into account. However, there may be many other factors, e.g. technical efficiency, industrial organisation, conditions of competition, research and development, and policy measures, which have to be controlled for when studying the convergence process. Since they are assumed to differ especially between East and West Germany due to the formerly different economic systems, we introduce an East-West dummy in order to control for growth-relevant factors not explicitly modelled. In this way dynamic spatial convergence analysis across German labour markets can be conducted in a tractable manner. As a point of departure, the baseline OLS estimation of the global model (3.3) together with LM error tests for spatial stationarity is displayed in Table 3. Both the income and productivity relationships are capable of meaningful


economic interpretation. In both models human capital proves to be highly significant. Although the coefficient of investment intensity has the theoretically expected sign in either model, it only shows significance in the income model. This result may be due to imperfections of newly founded businesses as an investment indicator. Differences in population growth across German labour market regions seem to exert an effect on income per capita but not on labour productivity. Evidently the high significance of the East dummy supports the supposition that non-explicitly modelled control variables are relevant to explaining differences between East and West German regions. The results of the LM error tests confirm that we are not concerned with the problem of spurious regression when estimating the global model (3.3). Like the LME statistic, the DLME statistic is also highly significant for all variables. This means that the null of a spatial unit root is rejected for all manifest variables. Consistent with this, the errors of both models turn out to be spatially stationary, too. Results from the IV estimation of the SADL income and productivity models are shown in Table 4. The high coefficient of the spatial lag of the dependent variable (W_Y) in the income model does not differ significantly from one, which could indicate spatial nonstationarity in the Fingleton sense

Table 3. Baseline OLS model estimation and LM tests for stationarity

| Variable | Income model (LGDPCR): Coefficient | Stand. err. | Productivity model (LGDPER): Coefficient | Stand. err. |
|---|---|---|---|---|
| Intercept | 10.018** | 0.445 | 10.613** | 0.291 |
| EAST | -0.346** | 0.037 | -0.290** | 0.024 |
| LDTW | 0.317* | 0.144 | 0.110 | 0.094 |
| LHUMAN | 0.254** | 0.039 | 0.168** | 0.025 |
| LNBF | 0.138* | 0.068 | 0.041 | 0.044 |
| R2 | 0.728 | | 0.764 | |
| SSE | 0.01525 | | 0.00652 | |
| F | 117.17 (p<0.0001) | | 141.72 (p<0.0001) | |

LM error tests for stationarity

| Variable | LME | p | DLME | p |
|---|---|---|---|---|
| LGDPCR | 148.468 | <0.0001 | 29.240 | <0.0001 |
| LGDPER | 202.064 | <0.0001 | 26.107 | <0.0001 |
| LDTW | 430.996 | <0.0001 | 7.946 | 0.0048 |
| LHUMAN | 433.277 | <0.0001 | 13.290 | 0.0003 |
| LNBF | 433.344 | <0.0001 | 26.508 | <0.0001 |
| Residuals (1st model) | 19.093 | <0.0001 | 40.602 | <0.0001 |
| Residuals (2nd model) | 19.480 | <0.0001 | 41.017 | <0.0001 |

Notes: LGDPER: Log gross domestic product per total employment; LGDPCR: Log gross domestic product per capita; EAST: East-West dummy (East German regions 1, West German regions 0); LDTW: Log (depreciation rate + rate of technical progress + growth rate of population); LHUMAN: Log proportion of highly educated people per total employment; LNBF: Log newly founded business per 1000 inhabitants. ** 1% significance level; * 5% significance level; p: actual significance level; R2: coefficient of determination; SSE: standard error of regression; F: F statistic. LME: LM error statistic for original model [(3.10) or (3.12)]; DLME: LM error statistic for spatially differenced model [(3.11) or (3.13)].


Table 4. IV Estimation of the SADL model

| Variable | Income model (LGDPCR): Coefficient | Stand. err. | Productivity model (LGDPER): Coefficient | Stand. err. |
|---|---|---|---|---|
| Constant | 3.007 | 3.136 | 9.849 | 26.511 |
| W_Y | 0.722* | 0.281 | 0.094 | 2.416 |
| LDTW | 0.242 | 0.157 | 0.039 | 0.114 |
| LHUMAN | 0.303** | 0.041 | 0.158** | 0.028 |
| LNBF | 0.161* | 0.075 | 0.034 | 0.121 |
| EAST | -0.369** | 0.061 | -0.264** | 0.042 |
| W_LDTW | -0.192 | 0.334 | 0.182 | 0.772 |
| W_LHUMAN | -0.302** | 0.096 | 0.037 | 0.456 |
| W_LNFB | -0.208(*) | 0.123 | 0.001 | 0.154 |
| W_EAST | 0.272* | 0.127 | 0.021 | 0.602 |
| Wald | 1373121** | | 3443117** | |

Notes: ** 1% significance level; * 5% significance level; p: actual significance level. W_Y: spatial lag of LGDPCR resp. LGDPER; W_X: spatial lag of variable X; EAST: East-West dummy (East German regions 1, West German regions 0).

(Fingleton 1999). From the simulation study in Lauridsen (2002), however, we learned that this estimate may be overstated. Human capital and investment intensity in a region exert a positive influence on regional income per capita, whereas their spatial lags act in the opposite direction. Essentially the same holds for the control variables comprised in the EAST dummy, with a change in sign. This means that in homogeneous regional environs the overwhelming part of the influences of regional and lagged exogenous variables is captured by the endogenous spatial lag variable. Only in heterogeneous regional neighbourhoods does explicitly allowing for regional endowments change the picture. The situation turns out to be somewhat different in the productivity model. Here we are not faced with the case of near nonstationarity. As in the income model, population growth plays only a subordinate role. Moreover, spatial lags of the exogenous variables are not suitable for explaining productivity.12 On the one hand, regional productivity levels can be understood in terms of different endowments of human capital. On the other hand, adverse realisations of the aforementioned factors in East German regions prove to be still crucial for establishing productivity differences. The insignificance of investment intensity may be due to the imperfect indicator problem. Tables 5 and 6 summarise the IV estimated SBA and SEC models, which provide insight into the local dynamics. It is seen from both tables that local differences in income and productivity are caused by local differences in the explanatory variables but not by their spatial lags. These results accurately reflect the findings for the SADL models (Table 4), where we have worked exclusively with level variables.

12 This can be inferred although we know that the coefficients and t values are probably slightly downward biased (Lauridsen 2002).

Table 5. IV Estimation of SBA model

| Variable | Income model (DLGDPCR): Coefficient | Stand. err. | Productivity model (DLGDPER): Coefficient | Stand. err. |
|---|---|---|---|---|
| Constant | 3.007 | 3.136 | 9.849 | 26.511 |
| W_Y | -0.278 | 0.281 | -0.906 | 2.416 |
| DLDTW | 0.242 | 0.157 | 0.039 | 0.114 |
| DLHUMAN | 0.303** | 0.041 | 0.158** | 0.028 |
| DLNBF | 0.161* | 0.075 | 0.034 | 0.121 |
| DEAST | -0.369** | 0.061 | -0.264** | 0.042 |
| W_LDTW | 0.050 | 0.308 | 0.220 | 0.729 |
| W_LHUMAN | 0.001 | 0.086 | 0.196 | 0.457 |
| W_LNFB | -0.047 | 0.108 | 0.035 | 0.251 |
| W_EAST | -0.097 | 0.105 | -0.243 | 0.607 |
| Wald | 178.991** | | 97.502** | |

Notes: ** 1% significance level; * 5% significance level; p: actual significance level. W_Y: spatial lag of LGDPCR resp. LGDPER; W_X: spatial lag of variable X; DX: spatial difference of variable X; EAST: East-West dummy (East German regions 1, West German regions 0).

Table 6. IV Estimation of SEC model

| Variable | Income model (DLGDPCR): Coefficient | Stand. err. | Productivity model (DLGDPER): Coefficient | Stand. err. |
|---|---|---|---|---|
| Constant | 3.007 | 3.136 | 9.849 | 26.511 |
| W_LAGYX | -0.278 | 0.281 | -0.906 | 2.416 |
| DLDTW | 0.242 | 0.157 | 0.039 | 0.114 |
| DLHUMAN | 0.303** | 0.041 | 0.158** | 0.028 |
| DLNBF | 0.161* | 0.075 | 0.034 | 0.121 |
| DEAST | -0.369** | 0.061 | -0.264** | 0.042 |
| W_LDTW | -0.227 | 0.296 | -0.686 | 1.720 |
| W_LHUMAN | -0.277 | 0.241 | -0.710 | 1.962 |
| W_LNFB | -0.324 | 0.255 | -0.871 | 2.175 |
| W_EAST | -0.375 | 0.370 | -1.149 | 3.022 |
| Wald | 178.991** | | 97.502** | |

Notes: ** 1% significance level; * 5% significance level; p: actual significance level. W_Y: spatial lag of LGDPCR resp. LGDPER; W_X: spatial lag of variable X; DX: spatial difference of variable X; EAST: East-West dummy (East German regions 1, West German regions 0).

In both models the same reaction coeﬃcient occurs for the expression which includes an endogenous spatial lag. While it returns simply the spatial lag of the dependent variable in the SBA model, in the SEC model it embodies an error-correction mechanism. For both dependent variables, DLGDPCR and DLGDPER, the adjustment coeﬃcient takes a negative sign which generally indicates the ‘‘working’’ of the error-correction mechanism. Since an eﬀective error-correction mechanism drives economies towards a global equilibrium, it is straightforwardly linked with the concept of conditional convergence. Although the adjustment coeﬃcients point to economic forces driving the regional economies towards their steady states, it has not been possible to prove their signiﬁcance. The reason for this may be found in the


sharp slowdown of the speed of convergence as of the second half of the 90s. After a strong catching-up process at the beginning of the 90s, both the income and productivity gaps between West and East Germany have only slightly closed since the mid 90s. This background stresses that our dynamic spatial modelling approach records a “snapshot” of the current functioning of the convergence process. At the end of the 20th century only weak local adjustment processes across German labour market regions towards a global equilibrium can be established.13

13 The estimation results for the SBE models are suppressed here in view of space limitations. In essence they provide no additional insights on spatial dynamics beyond the findings from IV estimation of SADL models.

6. Conclusions

In this paper a dynamic spatial modelling approach for analysing regional convergence processes is introduced. Instead of tracing adjustment processes in a time sequence, local adjustment to a global equilibrium is investigated. For this we have made use of recently developed concepts of spatial unit roots, spatial cointegration and spatial error-correction. It is shown that alternative dynamic representations of the general spatial distributed lag model (SADL model) provide deeper insights into the spatial dynamics of the economic system underlying regional convergence analysis. Moreover, we highlight how the spatial error-correction model (SEC model) can be estimated in accordance with the properties of the spatial variables. In an application we address the issue of income and productivity convergence in unified Germany. Due to expected distortions arising from administrative areal units, we refer to labour market regions defined economically on the basis of commuter flows. From a new test strategy for identifying the data generating process of spatial variables, spatial stationarity of all model variables is established. Thus, a simple form of the SEC model can be estimated without being liable to encounter the problem of spurious regression. About a decade after German unification only weak evidence for conditional convergence is obtained from IV estimation of the SEC models. The lack of significance may be due to a slowdown in closing the income and productivity gap in the second half of the nineties. Overall, deeper insight into the spatial dimension of regional convergence is obtained by the dynamic spatial modelling approach.

References

Anselin L (1982) A note on small sample properties of estimators in a first-order spatial autocorrelative model. Environment and Planning A 14: 1023–1030
Anselin L (1988a) Spatial econometrics: methods and models. Kluwer, Dordrecht
Anselin L (1988b) Lagrange multiplier test diagnostics for spatial dependence and spatial heterogeneity. Geographical Analysis 20: 1–17
Bohl MT (1998) Konvergenz westdeutscher Regionen? Neue empirische Ergebnisse auf der Basis von Panel-Einheitswurzeltests. Konjunkturpolitik 44: 82–99
Charemza W, Deadman D (1992) New directions of econometrics. Edward Elgar, Cheltenham, UK
Eckey H-F (2001) Der wirtschaftliche Entwicklungsstand in den Regionen des Vereinigten Deutschlands. Discussion Papers in Economics No. 20/01, University of Kassel, Department of Economics
Eckey H-F, Horn K, Klemmer P (1990) Abgrenzung von regionalen Diagnoseeinheiten für die Zwecke der regionalen Wirtschaftspolitik. Bochum
Evans P (1998) Using panel data to evaluate growth theories. International Economic Review 39: 295–306
Evans P, Karras G (1996) Convergence revisited. Journal of Monetary Economics 37: 249–265
Fingleton B (1999) Spurious spatial regression: some Monte Carlo results with spatial unit root and spatial cointegration. Journal of Regional Science 39: 1–19
Fingleton B (2000) Spatial econometrics, economic geography, dynamics and equilibrium: a ‘third way’? Environment and Planning A 32: 1481–1498
Fingleton B (2001) Equilibrium and economic growth: spatial econometric models and simulations. Journal of Regional Science 41: 117–147
Funke M, Strulik H (1999) Regional growth in West Germany: convergence or divergence? Economic Modelling 16: 489–502
Greene WH (2002) Econometric analysis, 5th ed. Prentice-Hall, London
Grömling M (2001) Produktivitätstrends der 90er Jahre. Statistische Überzeichnung dämpft New Economy Hoffnungen. IW-trends 21–37
Holmes MJ (2000) Convergence in international output: evidence from panel data unit root tests. Research Paper No. 00-6, Department of Economics, Loughborough University
Islam N (1995) Growth empirics: a panel data approach. Quarterly Journal of Economics 110: 1127–1170
Kelejian HH, Prucha I (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. Journal of Real Estate Finance and Economics 17: 99–121
Kelejian HH, Robinson DP (1993) A suggested method of estimation for spatial interdependent models with autocorrelated errors, and an application to a county expenditure model. Papers in Regional Science 72: 297–312
Kelejian HH, Robinson DP (1995) Spatial correlation: a suggested alternative to the autoregressive model. In: Anselin L, Florax RJGM (eds) New directions in spatial econometrics. Springer, Berlin Heidelberg New York, pp. 75–95
Kónya L (2001) Panel data unit root tests with an application. Discussion Paper, School of Applied Economics, Victoria University, Melbourne, Australia
Kosfeld R, Eckey H-F, Dreger Ch (2002) Regional convergence in the unified Germany: a spatial econometric perspective. Economic Discussion Papers No. 39/02, University of Kassel, Department of Economics
Lauridsen J (2002) Spatial autoregressively distributed lag models: equivalent forms, estimation, and an illustrative commuting model. Discussion Paper, Department of Statistics and Demography, University of Southern Denmark
Lauridsen J, Kosfeld R (2002) A test strategy for spurious regression, spatial nonstationarity, and spatial cointegration. Economic Discussions Paper No. 8/2002, Faculty of Social Sciences, University of Southern Denmark
Lucas RE (1988) On the mechanics of economic development. Journal of Monetary Economics 22: 3–42
Mankiw NG, Romer D, Weil DN (1992) A contribution to the empirics of economic growth. Quarterly Journal of Economics 107: 407–437
Niebuhr A (2001) Convergence and the effects of spatial interaction. Jahrbuch für Regionalwissenschaft 21: 113–133
Romer D (1996) Advanced macroeconomics. New York
Schalk HJ, Untiedt G, Lüschow J (1995) Technische Effizienz, Wachstum und Konvergenz in den Arbeitsmarktregionen der Bundesrepublik Deutschland (West). Jahrbücher für Nationalökonomie und Statistik 214: 25–49
Seitz H (1995) Konvergenz: Theoretische Aspekte und empirische Befunde für Westdeutschland. Konjunkturpolitik 41: 25–49

Spatial and supply/demand agglomeration economies: State- and industry-linkages in the U.S. food system

Jeffrey P. Cohen · Catherine J. Morrison Paul

Abstract. Cost-impacts of spatial and industrial spillovers on economic performance are evaluated by incorporating activity level measures for nearby states and related industries into a cost function model. We focus on localization and urbanization economies for state level food processing industries, from activity levels of similar industries in neighboring states, agricultural input suppliers, and final product demand. We find significant cost-savings from proximity to other food manufacturing centers, and areas with high purchasing power. Cost savings from locating near an agricultural area are also evident, although it seems costly to be located within a rural agricultural state, implying thin market diseconomies. Marginal production costs instead appear higher in more urban, and lower in more rural, areas. These spillover patterns also have input composition implications; materials demand responses are the most closely tracked by the agglomeration cost effects, and capital and labor impacts vary.

Key words: Spatial, costs, food, agglomeration

JEL classification: R300, O300

J. P. Cohen, Barney School of Business, University of Hartford, Department of Economics, Finance and Insurance, West Hartford, CT 06117, USA
C. J. Morrison Paul, Department of Agricultural and Resource Economics, University of California, One Shields Ave., Davis, CA 95616, USA, and member of the Giannini Foundation

1. Introduction

It is increasingly clear that interconnections among productive entities are substantive, and expanding as we move into a new era of modern production systems. These inter-dependencies have various dimensions, such as spatial and industrial agglomeration effects. They also have important technological and market structure implications; cost economies from such linkages may be driving trends toward urbanization and industrial concentration, and horizontal and vertical consolidation and integration. Understanding the extent

and consequences of these productive inter-dependencies requires modeling and measuring their impacts. However, conventional production structure analyses are based on models that do not recognize connections or externalities among economic entities, and resulting spillovers aﬀecting economic performance. In this paper we overview and implement a framework for including spillovers in cost and productivity analysis. Our treatment allows for temporal, spatial, and industrial spillovers, through input quasi-ﬁxities, geographic proximity, and horizontal and vertical linkages among own-industry establishments, and suppliers and customers. In particular, the model facilitates characterizing and measuring input rigidities that cause higher short than long run costs, implying economies of ﬂexibility, and spatial and sectoral inter-dependencies underlying thick market or agglomeration cost economies. As recognized by Hall (1989), production is more eﬃcient, or cost eﬀective, when it is concentrated over space as well as time. Such thick market eﬀects or agglomeration economies, that imply aggregate increasing returns (Hall 1990), may result from external knowledge spillovers that cause spatial density to ‘‘enhance the generation of innovation and yield higher rates of technological advance and economic growth’’ (Feldman 1999). Similarly, urbanization and localization economies associated with distance may take the form of cost eﬀects from demanding and supplying sectors. For these and other types of spatial and industrial spillovers, location involves ‘‘a geographic unit over which interaction and communication is facilitated,… and economic activity is enhanced’’ (Feldman). Diseconomies could also arise from counteracting external costs associated with density and urbanization, or with ‘‘ruralization’’ – thin market eﬀects from lack of markets in regions of sparse economic activity. 
We represent and measure such spillovers using a cost function framework, including activity measures for spatially and industrially linked sectors. Our empirical analysis is based on panel data for the food processing (manufacturing) industry, for the 48 contiguous U.S. states from 1986–1996. Our model recognizes temporal and spatial inter-dependencies within the industry, and externalities from the proximity of supplying and demanding sectors. To accommodate these linkages we include measures of capital constraints, neighboring states’ food processing levels, and own- and (weighted) neighboring states’ agricultural and total production levels, as cost function arguments. This allows us to compute the cost eﬀects of such factors in terms of shadow values. We also adapt the stochastic structure to represent temporal and spatial autoregressive patterns, and control for state size to recognize the greater potential for internalizing such beneﬁts in relatively large states. We ﬁnd evidence of spatial but not temporal spillovers; cost-saving beneﬁts are achieved from locating in a relatively food processing-intensive region (where neighboring states exhibit high levels of food manufacturing activity). Cost economies are also associated with locating in high-demand areas (urbanization economies), and close to rural areas. By contrast, locating directly in an agricultural-intensive state appears costly for producers, suggesting diseconomies associated with rural areas (thin market eﬀects from limited infrastructure support or input markets). We also ﬁnd that the supply and demand own-state eﬀects are reversed in terms of marginal costs, implying diﬀerent ﬁxed as compared to incremental cost patterns.

Spatial and supply/demand agglomeration economies


These thick market economies and diseconomies also have cross effects; input composition as well as overall costs is affected by spatial and industrial spillovers. Consideration of the substitutability or complementarity of production factors reveals weak linkages among the external effects. Observed cost patterns associated with external forces seem primarily related to materials demand, with more varied responses across driving factors emerging for capital and particularly labor demand.

2. Representing the cost structure and external spillover effects

Modeling and measuring the factors affecting economic performance typically involves specifying and estimating a production or cost function relationship, since performance is fundamentally based on the output producible from a given amount of inputs (primal), or the costs of a given level of output (dual). A cost function represents optimization (input demand) behavior in addition to the technological relationships embodied in the production function, and so becomes a function of the prices of productive (choice or internal) inputs rather than their levels. Otherwise it is a function of the same factors that appear as arguments of the production function. In particular, if externalities have an impact on production, they will also affect the cost relationship, through cost economies. More specifically, technically efficient production processes are often represented by a production function of the form $Y(X,T)$, where $Y$ is (aggregate) output, $X$ is a vector of inputs, and $T$ is a vector of external factors underlying the existing technological and environmental base, or production structure.
The least cost way to produce a given output level may in turn be characterized by a cost function of the form $TC(Y)$, or, more fully, $TC(Y,p,K,T) = VC(Y,p,K,T) + \sum_r p_r K_r$, where $TC$ is total input cost, $VC$ is variable input cost, $p$ is a vector of observed prices of the variable $X$ inputs, $K$ is a vector of levels of the quasi-fixed $X$ inputs, and $p_r$ is the market price of the $r$th quasi-fixed input. Various exogenous or external factors, including input fixities (temporal linkages) and spatial and industrial spillovers (thick market or agglomeration effects), representing inter-dependencies across time, space, and sector, may be captured as components of the $K$ and $T$ vectors. The cost function representing variable input cost minimization subject to the production function $Y(X,T)$ can be written as the short run cost curve $TC(Y;p,K,T)$. Constraints on $K$ adjustment cause a difference between short and long run cost curves, so $K$ adjustment implies cost savings (economies) from moving to or toward the long from the short run. If the $K$ factors are instead choice variables in the time frame represented by the data, $TC(Y;p,T)$ characterizes the long run cost curve. Changes in the (external) components of the $T$ vector also generate cost economies if they trigger a downward shift of the cost curve, or diseconomies if they involve an upward shift. And the optimization process embedded in the cost function implicitly captures the input demand changes, or the substitutability among internal and external productive factors, associated with such cost curve shifts. Modeling and measuring this full set of cost- and cross-effects therefore provides a rich basis for analyzing internal and external cost drivers, and resulting cost and economic performance patterns.


J. P. Cohen, C. J. Morrison Paul

Questions about the productive impact of any recognized cost determinant may be addressed in terms of optimizing responses for the internal (variable) factors, or shadow values for the quasi-fixed or external factors. For example, the total cost impact of a change in the price of a variable input reproduces, by Shephard's lemma, optimal input demand: $\partial TC/\partial p_k = X_k$. This measure becomes $\varepsilon_{TC,p_k} = \partial \ln TC/\partial \ln p_k = X_k p_k/TC = S_k$ in proportional terms, where $S_k$ is the cost share of the $k$th variable input, and $\varepsilon_{TC,p_k}$ denotes the total cost elasticity with respect to a change in input price $p_k$. The shadow values of output or inputs expressed as levels in the cost function may analogously be computed as first order derivatives. For example, $\partial TC/\partial Y = MC$, or $\varepsilon_{TC,Y} = \partial \ln TC/\partial \ln Y = MC \cdot Y/TC$, where $MC$ (marginal cost) is the shadow value of $Y$, and the cost elasticity $\varepsilon_{TC,Y}$ reflects scale economies. Similarly, the net shadow value of the $r$th quasi-fixed factor $K_r$, expressed as $\partial TC/\partial K_r = Z_r + p_r$ or $\varepsilon_{TC,K_r} = \partial \ln TC/\partial \ln K_r = (Z_r + p_r) K_r/TC$, where $Z_r = \partial VC/\partial K_r$ is the shadow value of $K_r$, captures the extent of subequilibrium for $K_r$.¹ More to the point for our application, shadow values and corresponding elasticities or proportional impacts may also be computed for the external shift factors in $T$. They can be measured as $\partial TC/\partial T_m = Z_m$, or $\varepsilon_{TC,T_m} = \partial \ln TC/\partial \ln T_m = Z_m T_m/TC$, if $T_m$ is a quantitative variable, and $\varepsilon_{TC,T_m} = \partial \ln TC/\partial T_m = Z_m/TC$ if $T_m$ is a time counter or qualitative variable. The most common of such measures, representing temporal cost trends, is typically expressed as the elasticity of $TC$ with respect to a time counter $t$: $\varepsilon_{TC,t} = \partial \ln TC/\partial t$.
Or, if time dummies rather than a time trend are included in the $T$ vector, the shift for a particular time period, $t_1$, may be measured as $\varepsilon_{TC,t_1} = \partial \ln TC/\partial t_1$.² Measuring the cost impacts from changes in the various arguments of the cost function – or “sourcing” the drivers of cost patterns – may be accomplished parametrically by empirically estimating the function and directly taking such derivatives. However, if the only shift factor in $T$ is the time trend $t$, as is typical for production analysis, the “technical change” measure $\varepsilon_{TC,t}$ essentially becomes a residual, even if it is estimated parametrically.³ The impacts of any cost factors not taken into account as arguments of the function (such as input rigidities or agglomeration effects) cannot be identified; they are embedded in the measured contributions of the recognized factors – $\varepsilon_{TC,t}$ as well as $\varepsilon_{TC,p_k}$ and $\varepsilon_{TC,Y}$ (and $\varepsilon_{TC,K_r}$ if quasi-fixity is recognized). That is, temporal spillovers from input rigidities may generate such problems if short run costs are higher than may be attained in the long run. If the distinction between short and long run behavior is relevant (affects costs), and this is not captured in the cost function specification, the effects of the underlying fixities will erroneously be embodied in the elasticity estimates.
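The elasticity definitions above can be illustrated numerically. The following is a minimal sketch under stated assumptions: the Cobb-Douglas cost function and all parameter values (`a`, `b`, `g`) are hypothetical and chosen only because their elasticities are known in closed form; this is not the specification estimated in this paper.

```python
import math

def total_cost(Y, p, t, a=0.3, b=0.8, g=0.01):
    """Hypothetical Cobb-Douglas cost function (illustration only)."""
    p1, p2 = p
    return (Y ** b) * (p1 ** a) * (p2 ** (1 - a)) * math.exp(g * t)

def log_elasticity(f, x, h=1e-6):
    """Central-difference estimate of d ln f / d ln x at x."""
    return (math.log(f(x * math.exp(h))) - math.log(f(x * math.exp(-h)))) / (2 * h)

Y, p, t = 10.0, (2.0, 5.0), 3.0
h = 1e-6

# Elasticities of total cost with respect to an input price and to output.
e_p1 = log_elasticity(lambda v: total_cost(Y, (v, p[1]), t), p[0])
e_Y = log_elasticity(lambda v: total_cost(v, p, t), Y)

# Shephard's lemma: X1 = dTC/dp1, so the cost share S1 = X1 * p1 / TC
# should coincide with the price elasticity e_p1.
X1 = (total_cost(Y, (p[0] + h, p[1]), t) - total_cost(Y, (p[0] - h, p[1]), t)) / (2 * h)
S1 = X1 * p[0] / total_cost(Y, p, t)

# Semi-elasticity with respect to the time counter: d ln TC / dt.
e_t = (math.log(total_cost(Y, p, t + h)) - math.log(total_cost(Y, p, t - h))) / (2 * h)

print(e_p1, S1, e_Y, e_t)
```

In this toy case the price elasticity equals the cost share, and an output elasticity below one corresponds to scale economies, mirroring the interpretation given in the text.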

1 The shadow values for internal outputs and inputs have optimization implications, since $MC = p_Y$ and $Z_r = p_r$ (where $p_Y$ is the market price of $Y$) if the $Y$ and $K_r$ markets are perfectly competitive, and if $Y$ and $K_r$ are at their profit-maximizing levels. However, this optimization is not a priori imposed on the model.
2 In this case, for comparison purposes, one time period must provide the basis for analysis – say $t_0$ – so these time derivatives represent the cost difference compared to $t_0$.
3 That is, rather than explicitly as a residual, as for the Solow residual which is commonly recognized to be a “measure of our ignorance”.


Temporal linkages might also affect the appropriate stochastic structure, so allowing for an autoregressive process in the stochastic specification may be necessary for justifiable estimation of the cost relationship. Similar problems arise if spatial and industrial spillovers (that differ over time and location, and across outputs and inputs) exist but are not recognized.⁴ If such impacts may be substantive, measures representing these externalities should be incorporated as components of the $T$ vector, to facilitate identifying their associated cost economies. Adaptations of the stochastic structure to recognize these inter-dependencies may also be required to generate valid estimates. Such spillovers may arise from various driving forces. Hall and Ciccone (1996) argue that thick markets – aggregate increasing returns or economies associated with spatial or industrial density – could emerge from local geographical externalities or the diversity of local intermediate services. Krugman (1991) and David and Rosenbloom (1990) emphasize that spatial inter-dependencies within and across sectors can enhance productivity through innovation generation or information diffusion. Zucker and Darby stress that synergies from human (knowledge) interactions and communication imply thick market effects associated with localized intellectual capital. And Coe and Helpman (1995) underscore the potential for spatial connections from the transmission of quality or ideas embodied in goods through trade to generate agglomeration economies. Thick market effects might well stem from such knowledge spillovers or inter-dependencies in the food processing industry, and thus motivate similar firms to concentrate in a particular geographic location.
If so, we can gain insights about the existence and extent of resulting cost economies by incorporating a measure of own-industry production levels in neighboring localities (states) as a $T$ component in our cost function specification.⁵ Such spatial (horizontal) linkages may also be accommodated in the stochastic structure, similarly to an autoregressive specification used to capture temporal inter-dependencies.⁶ In addition to this spatial thick market dimension, agglomeration effects might arise from proximity to supplying (agricultural producers) or demanding (consumers) sectors (in both own- and neighboring-states). Including measures of these vertically linked sectors' activity levels in the $T$ vector, expressed in terms of input or output levels of the suppliers or demanders, allows us to represent these inter-dependencies.⁷ Such sectoral externalities may be interpreted as urbanization and localization economies. If firms find it advantageous to locate close to an area of high

4 Note that estimating elasticities based on Shephard's lemma may also be misleading if wedges between measured and true economic values of the factors arise from technical and allocative inefficiency that might be attributable to specific factors generating this measured inefficiency, as recognized by Atkinson and Halvorsen (1990), Sickes and Streitwieser (1998), and Kumbhakar (2001), among others.
5 For our analysis, we weight these activity measures by land mass to recognize that such spillovers will be less important for a large than a small state.
6 This has been proposed in the spatial econometrics literature by, for example, Anselin (1988), Kelejian and Prucha (1999), and Bell and Bockstael (2000).
7 This is similar in spirit to Bartlesman et al. (1994) and Paul and Siegel (1999).


population density and buying power, implying perhaps greater final product demand or infrastructure support, this may be referred to as urbanization economies. If it is cost-saving to locate close to suppliers, this may be thought of as localization economies.⁸ In the food processing context, urbanization economies may be associated with high potential food demand levels (purchasing power) within a state or from its close neighbors, represented by concentrations of total production (GSP). Localization economies may be generated from high agricultural intensity in a state or its surrounding areas, and the resulting proximity or availability of primary agricultural materials.

3. Empirical implementation of a cost model with spatial and industrial spillovers

Accommodating temporal, spatial, and industrial linkages in a cost-based model may thus be accomplished by including production activity levels in linked time periods, areas, or sectors, as cost function arguments, and/or by recognizing them in the stochastic specification. To move toward an implementable model, however, we need to be more specific about these adaptations to the conventional framework. The most familiar of such adaptations is for the temporal dimension, where cost linkages between time periods are due to input stock durability and associated quasi-fixity. Incorporating temporal dependence in the structural model may be accomplished by representing existing stock factors, $K$, as a fixed input vector that is not optimized over in the short run. If the capital stock $K$ is the one quasi-fixed factor, its productive contribution may be expressed in terms of its shadow value, $Z_K$, and the deviation of short from long run equilibrium captured by the difference between $Z_K$ and $p_K$.⁹ Time-dependence may also be recognized by allowing for autoregressive errors in the stochastic structure.
For an AR(1) specification, for example, $TC_t = TC(\cdot)_t + u_t$ and $u_t = \rho_s u_{t-1} + \epsilon_t$ (where $\rho_s$ is the cost function-specific AR(1) parameter, $\epsilon_t$ is the white noise period-$t$ estimation error for $TC$, and $u_{t-1} = TC_{t-1} - TC(\cdot)_{t-1}$), so substitution for $u_t$ results in an estimating equation with an appropriate random error term, $TC_t = TC(\cdot)_t + \rho_s [TC_{t-1} - TC(\cdot)_{t-1}] + \epsilon_t$.¹⁰ In our preliminary empirical investigation, we found that allowing for the temporal dimension was not empirically relevant for our data; $K$ shadow values and market prices were not significantly different, and appending an AR(1) process did not substantively affect the results. This is perhaps a result of the greater spatial (states) than temporal (years) dimension of our data. It is also consistent with the findings of Goodwin and Brester (1995), Morrison (1997), and Arnade and Gopinath (1998), that adjustment of capital is relatively rapid in this industry, and that this flexibility is increasing. The primary emphasis in the empirical development and estimation below is thus on the

8 These distinctions are common in the Urban Economics literature, as developed and overviewed by Hoover (1948) and O'Sullivan (2000).
9 A more explicitly dynamic model may alternatively be developed by incorporating an indicator of adjustment costs, usually represented by the investment level $\Delta K = K_t - K_{t-1}$, as in Morrison (1985), which implicitly brings lagged variables into the cost representation.
10 This adjustment may be written in matrix form for an equation system, as in Berndt (1991).


spatial and industry dimensions, although it is useful to recognize the symmetry of the temporal and spatial model adaptations. That is, a spatial externality index representing the dependence of costs in state $i$ on own-industry activity in geographically connected areas may be incorporated as an argument of the cost function, analogously to the inclusion of quasi-fixed input levels to represent temporal inter-dependencies.¹¹ Such an index may be defined as the weighted sum of all state $j$'s activities ($a_j$ = production, input use, or costs) related to that of state $i$, $\sum_{j \neq i} w_{i,j} a_{j,O} = W A_O = A^W_O$, so $TC = TC(Y,p,t,A^W_O)$. Establishing the cost benefit of adjoining states' activity thus involves measuring the shadow value $Z_{A^W_O} = \partial TC/\partial A^W_O$. “Related to” in this case implies being in the same (or “own”, denoted by subscript $O$) industry, but in neighboring states, implying thick market effects with only a spatial dimension. However, connections could also, or alternatively, arise from spillovers or agglomeration effects associated with the activity of vertically linked sectors, as in Bartlesman et al. (1994). Bartlesman et al. recognized productive impacts from weighted sums of “aggregate activity”, based on the share of materials received by or supplied to other industries, in a first-order model of aggregate U.S. manufacturing. Measures of the externalities $\sum_j w_{i,j} a_{j,d} = A^W_D$ (in our notation, where $j$ now denotes industry and $d$ denotes demanding, $D$, or supplying, $S$, sector), were embedded into a first-differenced log-linear production function relationship to identify their performance impact. Paul and Siegel (1999) incorporated analogous measures into a cost function model of the form $TC = TC(Y,p,t,A^W_D,A^W_S)$ (where $t$, $A^W_D$, and $A^W_S$ are the components of the $T$ vector). Quantifying the impacts of supply- and demand-agglomeration spillovers in this context involves establishing the magnitude and significance of the shadow values $Z_{A^W_D} = \partial TC/\partial A^W_D$ and $Z_{A^W_S} = \partial TC/\partial A^W_S$.
In addition, spatial linkages may be accommodated through adaptation of the stochastic structure, similarly to that for temporal autocorrelation, to allow for spatial autoregressive (SAR) errors (as developed in the spatial econometrics literature by Anselin, 1988, and others). In this context, spatial inter-connections are defined via lags for geographical location at any one point in time. This is analogous to a standard AR(1) adjustment if only one state's activity level affects the state under consideration (state $i$); $TC_{i,t} = TC(\cdot)_{i,t} + u_{i,t}$, where $u_{i,t} = \rho_s u_{j,t} + \epsilon_{i,t}$, $u_{j,t}$ is the (unadjusted) error term for state $j$ at time $t$, and $\epsilon_{i,t}$ is a white-noise error. If, however, multiple states have productive inter-dependencies, the error structure for state $i$ at time $t$ becomes $u_{i,t} = \rho_s \sum_j w_{i,j} u_{j,t} + \epsilon_{i,t}$, or $u_t = \rho_s W u_t + \epsilon_t$ in matrix notation, where $W$ is a weighting matrix and $u_t$ is a vector of time-$t$ error terms for each state that has a cost effect on state $i$. Substituting this expression into the cost equation yields $TC_t = TC(\cdot)_t + \rho_s W u_t + \epsilon_t$, where $W u_t$ is a weighted sum of the $u_{j,t}$ from $TC(\cdot)$ estimation for other states (assuming $w_{i,i} = 0$). Because $u_t = TC_t - TC(\cdot)_t$, this can be rewritten as $TC_t = TC(\cdot)_t + \rho_s W TC_t - \rho_s W TC(\cdot)_t + \epsilon_t$. Determining the

11 Both these (structural and stochastic model) adjustments are made in the spatial econometrics literature, which has primarily focused on linkages of government expenditures across states. So, for example as in Case et al. (1993), $W$ becomes a weighting matrix for other states' expenditures, $E_t$, in the estimating model, as well as for $u_t$ in the stochastic specification.


“connecting” states and defining appropriate weights thus become important issues for implementation of the model. In this study we allow for a combination of spatial and industrial spillovers to empirically capture a web of thick market and agglomeration, or urbanization and localization, economies across states for the U.S. food processing industry. We define the activity variables $a_j$ underlying the spillover factor $A^W_O$ as production levels in the own (food processing) industry in neighboring states. The weights $w_{ij}$ used for the weighted average, as well as the spatial autocorrelation adjustment, give all states bordering state $i$ equal weight, and all other states zero weight.¹² Our primary sectoral spillover measures, (unweighted) measures of own-state agricultural production and GSP, $A_S$ and $A_D$, represent within-state supply- and demand-agglomeration effects.¹³ We also allow for supply-side sectoral inter-dependencies from neighboring states, through a weighted sum of agricultural production in states with joint borders, $A^W_S$ (based on the weights $w_{ij}$ used for constructing $A^W_O$). The spillover variables were normalized by state size, in terms of land mass, to recognize that it is the intensity or density of supplier and demander production levels that drives urbanization and localization economies. Our final estimation model is therefore based on a cost function of the form $TC(Y,p_N,p_P,p_M,p_K,t,D_S,A^W_O,A_D,A_S,A^W_S)$, where $Y$ is own-state output from the food manufacturing sector, $N$, $P$, $K$ and $M$ are non-production labor, production labor, capital, and intermediate materials inputs, $t$ is a time counter, $D_S$ is a vector of state dummy variables, and $A^W_O$, $A^W_S$, $A_D$, $A_S$ reflect (weighted) activity levels of neighboring states in the same and the agricultural sector, and within-state demanders and suppliers.
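The border-based weighting scheme and the spatial autoregressive error structure described above can be sketched as follows. This is an illustration only: the four-state border pattern, the output and land figures, and the value of `rho` are hypothetical; scaling each row of the weight matrix to sum to one is one possible reading of "equal weight", and dividing each state's activity by its own land area is one possible reading of the land-mass normalization.

```python
import numpy as np

# Hypothetical border pattern for four toy states (1 = shared border).
borders = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
], dtype=float)

# Equal weight for each bordering state, zero for all others.
# Row normalisation (rows sum to one) is an assumption of this sketch.
W = borders / borders.sum(axis=1, keepdims=True)

# Hypothetical own-industry output and land area by state.
output = np.array([120.0, 80.0, 200.0, 40.0])
land = np.array([50.0, 100.0, 25.0, 80.0])

density = output / land      # activity per unit of land (assumed normalisation)
A_W_O = W @ density          # neighbouring states' weighted activity index

# SAR errors: u = rho * W u + eps, so u = (I - rho W)^{-1} eps.
rho = 0.4                    # hypothetical spatial autocorrelation parameter
rng = np.random.default_rng(0)
eps = rng.normal(size=4)     # white-noise errors
u = np.linalg.solve(np.eye(4) - rho * W, eps)

print(A_W_O)
print(u)
```

Solving the linear system rather than inverting the matrix explicitly gives the same errors and is the more stable choice numerically; alternative weight matrices (for example, trade-based weights) would only change how `W` is built.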
The cost function is assumed to have the flexible generalized Leontief form:

$$
\begin{aligned}
TC(Y, p_N, p_P, p_M, p_K, t, D_S, A^W_O, A_D, A_S, A^W_S)
 ={}& \sum_k \sum_i \delta_{ki}\, p_k D_i + \sum_k \sum_l \alpha_{kl}\, p_k^{.5} p_l^{.5} + \sum_k \delta_{kY}\, p_k Y + \sum_k \sum_n \delta_{kn}\, p_k T_n \\
 &+ \sum_k p_k \Big( \delta_{YY} Y^2 + \sum_n \delta_{nY} T_n Y + \sum_n \sum_m \delta_{nm} T_n T_m \Big)
\end{aligned}
\tag{1}
$$

where $i$ denotes state, $k$, $l$ the variable inputs $N$, $P$, $M$, $K$, and $m$, $n$ the external shift factors or $T$ components $A^W_O$, $A^W_S$, $A_D$, $A_S$, and $t$. This total cost function

12 Other weighting structures could alternatively be specified. For example, although the common approach in the spatial econometrics literature of including all other states as “neighbors” may not seem relevant here, one could postulate that the connection between neighboring states depends on something more than just having a border. For example, it could depend on the amount of “trade” between the neighboring states. We tested such an alternative model, where the impact of neighbors is weighted by the value of shipments of goods between neighboring states (data from the 1992 Commodity Flows Survey, Bureau of Transportation Statistics), and found our simpler specification to be preferred. More specifically, Case, Rosen and Hines (1993) suggested a procedure whereby a new weight $W$ is constructed as a further-weighted average of two potential weights, $W = aW_1 + (1-a)W_2$, where $W_1$ and $W_2$ are the two different candidates for weights, and indicate that the likelihoods of the models as $a$ ranges from 0 to 1 may be compared as a test of the relevant weight specification. We compared the likelihoods for cases where $a = 0$ and $a = 1$, and found that the likelihood drops to −15522 from −14270.9 when the trade weights are used.
13 Lagged values were alternatively tried in order to accommodate possible endogeneity or overlap between the sectors, but this had very little impact on the results.


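by definition embodies optimal input demand for $N$, $P$, $M$, $K$, given $Y$ and $T$, so Shephard's lemma may be used to formalize the demand equations:

$$
X_k = \partial TC/\partial p_k = \sum_i \delta_{ki} D_i + \sum_l \alpha_{kl}\, p_l^{.5}/p_k^{.5} + \delta_{kY} Y + \sum_n \delta_{kn} T_n + \delta_{YY} Y^2 + \sum_n \delta_{nY} T_n Y + \sum_n \sum_m \delta_{nm} T_n T_m .
\tag{2}
$$

Similarly, the shadow values for the arguments of the function expressed in levels – $Y$ and $T_m$ – may be expressed as:

$$
Z_Y = MC = \partial TC/\partial Y = \sum_k \delta_{kY}\, p_k + \sum_k p_k \Big( 2\delta_{YY} Y + \sum_n \delta_{nY} T_n \Big),
\tag{3}
$$

where $MC$ is the marginal cost of $Y$, and

$$
Z_m = \partial TC/\partial T_m = \sum_k \delta_{km}\, p_k + \sum_k p_k \Big( \delta_{mY} Y + \sum_n \delta_{nm} T_n \Big).
\tag{4}
$$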

The system of Eqs. (1) and (2), adapted to incorporate a spatial autoregressive stochastic structure, comprises the estimation model (because $MC$ and $Z_m$ are not observable). Equations (1)–(4), however, provide the basis for constructing our measures of the production structure. They allow us to estimate the cost, output value, and input demand-specific impacts of the spillover factors in $T$. They also permit estimation of other measures characterizing production processes and behavior, such as scale economies and their input-specific components, and input substitution patterns. More specifically, we have seen that the $TC(\cdot)$ function can be used to estimate first-order elasticities from the derivatives in (2)–(4), as $\varepsilon_{TC,p_k}$, $\varepsilon_{TC,Y}$, $\varepsilon_{TC,T_m}$, and $\varepsilon_{TC,t}$. These elasticities represent cost impacts of changes in input prices, output levels, spillovers and temporal or spatial factors. Since the flexible functional form used for estimation embodies a full range of cross-effects among the arguments of the function, second order derivatives and elasticities may also be computed to represent interactions among the cost drivers. For example, the impacts of changes in an external factor (or other cost determinant such as $p_k$ or $t$) on marginal costs may be computed as $\varepsilon_{MC,T_m} = \partial \ln MC/\partial \ln T_m$. Similarly, input demand responses to a change in $T_m$ are captured by $\varepsilon_{X_k,T_m} = \partial \ln X_k/\partial \ln T_m$. In reverse, the dependence of the shadow value of $T_m$ on any cost function argument, such as $Y$, may be computed as $\varepsilon_{Z_m,Y} = \partial \ln Z_m/\partial \ln Y$.
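As a rough check on how Eqs. (1)–(3) fit together, the sketch below codes the generalized Leontief cost function, the Shephard's-lemma input demands, and marginal cost. All coefficient values are random placeholders rather than the estimates reported in the appendix, the function and variable names are hypothetical, the number of toy states is arbitrary, and symmetry of the $\alpha_{kl}$ coefficients is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_shift, n_states = 4, 5, 3   # inputs N, P, M, K; five T components; toy states

# Hypothetical coefficients (random placeholders, not estimated values).
d_ki = rng.normal(size=(n_inputs, n_states))   # state-dummy terms
a_kl = rng.normal(size=(n_inputs, n_inputs))
a_kl = (a_kl + a_kl.T) / 2                     # symmetry assumed for alpha_kl
d_kY = rng.normal(size=n_inputs)
d_kn = rng.normal(size=(n_inputs, n_shift))
d_YY = 0.01
d_nY = rng.normal(size=n_shift)
d_nm = rng.normal(size=(n_shift, n_shift))

def common_term(Y, T):
    """Bracketed term of Eq. (1) that multiplies the sum of input prices."""
    return d_YY * Y**2 + (d_nY @ T) * Y + T @ d_nm @ T

def total_cost(p, Y, T, D):
    """Eq. (1): generalized Leontief total cost."""
    root_p = np.sqrt(p)
    return (p @ d_ki @ D + root_p @ a_kl @ root_p + (p @ d_kY) * Y
            + p @ d_kn @ T + p.sum() * common_term(Y, T))

def input_demand(p, Y, T, D):
    """Eq. (2): X_k = dTC/dp_k by Shephard's lemma."""
    root_p = np.sqrt(p)
    return d_ki @ D + (a_kl @ root_p) / root_p + d_kY * Y + d_kn @ T + common_term(Y, T)

def marginal_cost(p, Y, T):
    """Eq. (3): Z_Y = MC = dTC/dY."""
    return p @ d_kY + p.sum() * (2 * d_YY * Y + d_nY @ T)

p = np.array([1.0, 2.0, 1.5, 0.8])        # hypothetical input prices
Y = 3.0                                   # hypothetical output level
T = rng.uniform(0.5, 2.0, size=n_shift)   # hypothetical shift factors
D = np.array([0.0, 1.0, 0.0])             # dummy selecting the second toy state

X = input_demand(p, Y, T, D)
TC = total_cost(p, Y, T, D)
# TC is linearly homogeneous in prices, so input costs should add up to TC.
print(np.isclose(p @ X, TC))
```

Because every term of the cost function is linear in prices or a product of two square-rooted prices, the adding-up check holds for any coefficient values, which makes it a convenient sanity test before elasticities are computed from estimated parameters.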

4. Estimation and results

The broad range of production cost determinants incorporated in our cost function specification allows a variety of such relationships to be estimated and assessed. Our estimates of these measures provide insights about cost patterns and spillover impacts for the U.S. food processing sector, on average across states from 1986 to 1996. Our system of equations comprised of (1) and (2) was estimated by PCTSP using seemingly unrelated regression procedures, for the food processing industries of the 48 contiguous states (an overview of data construction procedures and data summary statistics are presented in Appendices A and B). Allowing for heteroskedasticity, by using robust-White methods to compute the standard errors, made no substantive difference to the results. Incorporating an AR(1) process in the TC equation and each of the input demand equations also had a negligible impact on the measured indicators, even


though all $\rho_s$ estimates (except for the $K$ equation) were statistically significant. These adaptations were thus omitted in the final specification. By contrast, spatial and sectoral linkages seem to be key cost drivers. A SAR adjustment, as described above for $TC(\cdot)$, was made for each equation in the estimating system, leading to separate spatial autocorrelation parameters for each cost and input demand equation. The resulting $\rho_s$ estimates were statistically significant (again except for the $K$ equation), and the adaptation had some impact on the elasticity measures (although not substantively in terms of overall patterns). The shadow values of the included spatial and industrial spillover variables were also significant. These aspects of the model were thus retained for the final reported specification. The estimated coefficients for the model are presented in Appendix Table B2 (with t-statistics in italics). The state dummy parameters are omitted to keep the table manageable, but were primarily significant. The t-statistics for the remaining coefficients indicate overall statistical significance, although the cross-terms for the external effects are largely insignificant. Omitting these terms, however, did not affect the results substantively, and indicated some joint significance. The model was thus left fully flexible, so the significance of the complete range of elasticities, each based on a combination of coefficients and their standard errors, could be examined. The $R^2$s (all greater than 0.99) also indicate a very close fit for the equations as a system. Our shadow value and elasticity estimates, capturing the total and marginal cost-effects of changes in the spillover factors and other arguments of the cost function, are provided in Table 1. These and all other reported estimates are computed as (unweighted) averages of the measures across all states, and presented with their standard deviations, and maximum and minimum state values.
The standard errors were computed by evaluating the elasticities at the mean values of all the variables in the model. The shadow values themselves are not very interpretable, since they are expressed in levels and thus depend on the units of measurement. Note, however, that on average the estimated spillover effects are significantly negative (implying cost-savings) for all external factors except $A_S$ – own-state agricultural (supplier) production – for which the measure is significantly positive. The latter initially surprising result, indicating that food processing production costs are higher in heavily agricultural states, was quite robust across alternative specifications. It is also consistent with the fact that state governments increasingly promote food manufacturing activity “as an economic development strategy designed to counteract effects of rural population decline and job losses” (Goetz 1997), implying subsidies for such activity in otherwise less profitable areas. This evidence also provided the motivation for our inclusion in the model of $A^W_S$, the measure of neighboring states' weighted agricultural production, which by contrast indicates cost-saving benefits of proximity to agricultural producers. Overall, these measures document state-level cost economies associated with thick markets from own-industry agglomeration, as suggested by Goetz (1997). This is implied not only by the significantly negative (cost-saving) value of $Z_{A^W_O}$, based on the extent of food processing activity in neighboring states, but also by the value of $Z_Y$, through its implications for scale economies. The average value of $\varepsilon_{TC,Y} = Z_Y Y/TC = \partial \ln TC/\partial \ln Y$ falls significantly short of one, suggesting that more output may be obtained with a less than proportionate cost increase (as also found by Morrison

Table 1. Shadow values and total and marginal cost elasticities

Measure          Estimate    St. dev.   Min        Max       St. error   P-value
Z_{A^W_O}        -0.0884     0.035      -0.246     -0.044    0.028       [0.002]
Z_{A_S}           0.0159     0.002       0.010      0.020    0.002       [0.000]
Z_{A^W_S}        -0.0103     0.001      -0.015     -0.008    0.002       [0.000]
Z_{A_D}          -0.00013    0.00005     0.000      0.000    0.00003     [0.000]
Z_Y = MC          0.5034     0.035       0.400      0.632    0.014       [0.000]
e_{TC,A^W_O}     -0.3607     0.492      -2.916     -0.011    0.042       [0.002]
e_{TC,A_S}        0.3574     0.438       0.034      2.691    0.022       [0.000]
e_{TC,A^W_S}     -0.7032     1.584     -10.895     -0.004    0.031       [0.000]
e_{TC,A_D}       -0.1889     0.413      -3.066      0.028    0.000       [0.000]
e_{TC,Y}          0.7401     0.109       0.566      2.029    0.020       [0.000]
e_{TC,t}          0.0493     0.069       0.003      0.396    0.001       [0.000]
e_{TC,pN}         0.0468     0.023      -0.190      0.154    0.001       [0.000]
e_{TC,pP}         0.0743     0.022      -0.084      0.218    0.002       [0.000]
e_{TC,pM}         0.8218     0.264      -1.486      2.622    0.004       [0.000]
e_{TC,pK}         0.0828     0.029       0.005      0.313    0.001       [0.000]
e_{MC,A^W_O}     -0.0169     0.008      -0.045     -0.001    0.011       [0.132]
e_{MC,A_S}       -0.0091     0.008      -0.050      0.000    0.006       [0.148]
e_{MC,A^W_S}     -0.0180     0.026      -0.210     -0.001    0.010       [0.073]
e_{MC,A_D}        0.0302     0.043       0.001      0.220    0.005       [0.000]
e_{MC,Y}         -0.0164     0.017      -0.103      0.000    0.005       [0.001]
e_{MC,t}          0.0002     0.000       0.000      0.000    0.000       [0.589]
e_{MC,pN}         0.0267     0.013      -0.010      0.070    0.003       [0.000]
e_{MC,pP}         0.0491     0.014       0.013      0.112    0.003       [0.000]
e_{MC,pM}         0.9077     0.030       0.809      0.989    0.006       [0.000]
e_{MC,pK}         0.0166     0.004       0.006      0.028    0.003       [0.000]

1997).¹⁴ These external and internal economies thus have synergistic effects,¹⁵ consistent with Gopinath and Vasavada's (1999) suggestion of significant intra-industry spillovers from industry level R&D in this sector, and Paul and Siegel's (1998) finding that such impacts are augmented by knowledge capital effects from human and high-tech capital. The reported measures also provide evidence of demand- and supply-side agglomeration economies. Demand-side or urbanization economies are implied by the primarily negative $Z_{A_D}$ values. Supply-side or localization economies are associated with agricultural intensity in neighboring states, since $Z_{A^W_S} < 0$ (for all states), but “thin market” diseconomies arise from locating in too rural a state (in terms of agricultural production per square mile): $Z_{A_S} > 0$. The corresponding elasticity measures show that the strongest

14 If one divides the sample into the earlier and later part of the sample (1986–1990 versus 1991–1996) to identify trends, these economies also seem to be increasing, even though this is a short time series. The average $\varepsilon_{TC,Y}$ of 0.74 drops from 0.78 to 0.70 between these two time periods. This is the only substantive change in the $Y$, $T$, and $K$ elasticities, although some evidence of varying average cost shares for the variable inputs is evident, with that for $N$ dropping, and $P$ and $M$ increasing.
15 Since our dataset has a spatial as well as temporal dimension, this indicates both that expansion of a state's food processing industry implies lower average production costs, and that states with higher $Y$ levels have lower unit costs of production (given all other cost determinants represented in the function).


cost-saving impact is that from proximity to agricultural producers (from eTC,AWS), although the measured beneﬁts vary widely across states. If this combination of external eﬀects is interpreted analogously to a combination of the production of a variety of outputs, the idea of a multioutput scale economy measure may be adapted to aggregate these eﬀects. As developed by Baumol et al. (1982), a multiple output scale economy measure may simply be computed as the sum of the corresponding cost elasticities with respect to output. Such a measure for S outputs, Ys, eTCY ¼ (Ss¶TC/¶Ys • Ys)/TC ¼ SsMCs • Ys/TC ¼ SseTCYr, indicates the cost impact if all rather than just one output increased by 1%. If the external eﬀects are treated similarly, on average the supply-side agglomeration eﬀects from neighboring states, eTC,AWS, outweigh that from the own state, eTC,AS, implying an overall cost-saving beneﬁt from agricultural supply sector externalities. Adding the other measures results in an even larger (in absolute value) eﬀect, indicating that nearly 0.9 % lower costs are implied if on average all spillover factors are 1% higher.16 The ﬁnal total cost elasticities to consider are the eTC,t measures, representing the time trend in food processing costs, and the eTC,pk measures, reﬂecting the input shares. The average eTC,t value suggests increasing costs over time, which is contrary to its usual interpretation as a technical change indicator. However, costs in this sector might well be rising due to increasing food processing, quality, and diversity demanded by consumers. The input elasticities show that intermediate materials comprise a greater proportion of total costs than in other manufacturing industries, which might be expected for food; eTC,pM ¼ 0.82 (82%) on average. 
Also, the share of production workers exceeds that for non-production workers (0.07 versus 0.05), and the capital share is higher and labor share lower than in aggregate manufacturing (about 0.08 and 0.12 on average).

The marginal cost elasticities also provide insights about cost patterns; they represent incremental cost effects, versus the total or average cost effects captured by the TC elasticities. The MC and TC elasticities in Table 1 indicate that increases in own- and supplying-industry production in neighboring states cause marginal as well as average (total, given output) costs to fall, but that marginal costs drop by a smaller proportion; e.g., eTC,AWO < eMC,AWO < 0. By contrast, the supply and demand average and marginal own-state effects are reversed in sign (eMC,AS < 0 < eTC,AS and eMC,AD > 0 > eTC,AD); greater potential demand in the state implies higher marginal costs on average, and more agricultural intensity, or “rurality”, implies lower marginal costs. These spillover factors thus seem to act more as fixed than incremental effects. Marginal costs also appear to be increasing over time, but not significantly (either statistically or in terms of magnitude). And intermediate materials are a much larger share of marginal than total costs, at 91%, whereas marginal increases in labor and capital costs to accommodate greater production are only half, and less than one-third, the corresponding average increases.
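The aggregation borrowed from the multi-output scale measure can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and every numeric input are hypothetical and are not estimates from this study.

```python
from typing import Sequence


def multi_output_cost_elasticity(
    marginal_costs: Sequence[float],
    outputs: Sequence[float],
    total_cost: float,
) -> float:
    """Return sum_s MC_s * Y_s / TC, i.e. the sum of output cost elasticities."""
    if len(marginal_costs) != len(outputs):
        raise ValueError("marginal_costs and outputs must have equal length")
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return sum(mc * y for mc, y in zip(marginal_costs, outputs)) / total_cost


def summed_elasticity(elasticities: Sequence[float]) -> float:
    """Aggregate separate elasticities (e.g. spillover effects) by summation."""
    return sum(elasticities)


# Hypothetical values, chosen only to illustrate the arithmetic.
example = multi_output_cost_elasticity([2.0, 1.0], [10.0, 20.0], 100.0)
```

As the accompanying footnote cautions, summing observation-specific elasticities that have first been averaged is only broadly indicative, so a sum of this kind is better computed per observation where the data allow.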

16

This experiment is not fully justiﬁable, however, at least on average, since each of these measures is evaluated for a particular state and time based on actual levels of external factors. Since the measures vary widely by observation, a simple average and sum is only broadly indicative of the actual aggregate eﬀects.

Spatial and supply/demand agglomeration economies


The results discussed so far, representing cost patterns, and in particular cost eﬀects of spatial and industry spillovers, are our primary focus. However, it is also informative to explore their underlying second order implications – the associated input demand and shadow value patterns. The input demand elasticities are presented in Table 2. First, note that all own-elasticities (such as eN,pN for non-production labor) are negative and statistically signiﬁcant, implying theoretically appropriate demand responses to input price changes. Production labor appears to adjust the most in response to a change in its price, and intermediate materials the least. The remaining input price elasticities indicate that substitutability prevails across factors, consistent with Huang (1991) and Goodwin and Brester (1995). Although we will not explore these patterns in depth, some are particularly striking, such as large K responses (as in Morrison 1997) – especially to pP. Higher production labor prices appear to induce mechanization; or areas where production labor wages are low attract less capital-intensive food manufacturing processes or industries. Materials use instead adapts little to changes in the prices of other inputs, although its demand response to pP is the strongest, and to pK the weakest. When production worker wages are high, more intermediate materials are used, perhaps indicating that less time is taken to screen incoming agricultural products, resulting in more waste. Input-speciﬁc scale and time elasticities, eXk,Y and eXk,t, are also presented in Table 2. The Y elasticities indicate that output augmentation is supported primarily by increased materials use, which is consistent with the implications from the marginal cost elasticities.17 By contrast, larger output levels seem to be associated with only slightly higher capital stocks. 
The t elasticities indicate a fall in the use of non-production workers over time (but not signiﬁcantly), and only a small increase in capital. But P and M demands for a given output level seem to be rising (signiﬁcantly) by 5–6% per year on average, which could be associated with expanded demands for more processed and higher quality ﬁnal food products, including increased packaging. The input demand elasticities with respect to the external factors indicate widely varying input-speciﬁc spillover impacts. The cost-saving impact of own-industry thick markets is related primarily to lower M and P use; more food processing activity in neighboring states actually implies higher N levels (although changes in both types of labor are statistically insigniﬁcant). Costs associated with greater in-state agricultural intensity are also primarily related to M (and to some extent P) use. This may suggest that food processing establishments requiring higher levels of agricultural or other materials, and production workers, are more likely to locate in rural areas. Cost savings of having neighboring agricultural states are also driven by materials use (and some reduction in K), but imply greater labor demand. This perhaps indicates that ﬁrms requiring more labor but less agricultural materials beneﬁt from proximity, but not close connections, to suppliers. And urbanization economies, or the beneﬁts of nearness to demand concentrations, are associated with lower levels of all inputs. Finally, it is useful to consider elasticities of the spillover shadow values, presented in Table 3. Note ﬁrst that the low overall statistical signiﬁcance

17

These are, of course, directly related since they are inverse 2nd order elasticities; eMC,pM is based on the $\partial^2 TC/\partial Y\,\partial p_M$ derivative, and eM,Y is based on $\partial^2 TC/\partial p_M\,\partial Y$.
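The link can be made explicit with a short derivation. This is a sketch that assumes the elasticities are defined as logarithmic derivatives and that Shephard's lemma holds for materials, $M = \partial TC/\partial p_M$; neither assumption is spelled out in this footnote.

```latex
\begin{align*}
e_{MC,p_M} &= \frac{\partial MC}{\partial p_M}\,\frac{p_M}{MC}
            = \frac{\partial^2 TC}{\partial Y\,\partial p_M}\,\frac{p_M}{MC},
  \qquad MC = \frac{\partial TC}{\partial Y},\\
e_{M,Y}    &= \frac{\partial M}{\partial Y}\,\frac{Y}{M}
            = \frac{\partial^2 TC}{\partial p_M\,\partial Y}\,\frac{Y}{M},
  \qquad M = \frac{\partial TC}{\partial p_M},\\
\frac{e_{MC,p_M}}{e_{M,Y}} &= \frac{p_M\,M}{MC \cdot Y}
  = \frac{p_M\,M / TC}{MC \cdot Y / TC}
  = \frac{e_{TC,p_M}}{e_{TC,Y}}.
\end{align*}
```

Under these assumptions the two elasticities differ only by the ratio of the materials cost share to the output cost elasticity, evaluated at the same observation. The identity need not hold for sample averages.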


Table 2. Input demand elasticities

Measure    Estimate   St. dev.      Min       Max   St. error   P-value
eN,AWO       0.0119     1.204     -6.236     7.457     0.086    [0.057]
eN,AS        0.1500     0.823     -5.341     3.790     0.054    [0.055]
eN,AWS       0.1969     0.733     -2.361     5.475     0.079    [0.988]
eN,AD       -0.2599     1.294     -7.968     7.496     0.091    [0.000]
eN,Y         0.4604     0.293     -0.233     2.541     0.051    [0.000]
eN,t        -0.0195     0.106     -0.833     0.465     0.003    [0.917]
eN,pN       -0.8287     2.257    -25.565    -0.028     0.035    [0.000]
eN,pP        0.1051     0.279      0.004     3.211     0.034    [0.506]
eN,pM        0.6246     1.707      0.021    19.201     0.055    [0.015]
eN,pK        0.0989     0.271      0.003     3.152     0.037    [0.566]

eP,AWO      -0.3815     0.945     -8.219     1.604     0.051    [0.000]
eP,AS        0.2378     0.305     -0.640     1.441     0.033    [0.000]
eP,AWS       0.1451     0.701     -1.148     6.944     0.049    [0.830]
eP,AD       -0.2255     0.670     -2.741     5.116     0.054    [0.000]
eP,Y         0.5111     0.172      0.101     1.425     0.031    [0.000]
eP,t         0.0509     0.079     -0.014     0.579     0.002    [0.000]
eP,pN        0.0597     0.101      0.002     0.653     0.020    [0.506]
eP,pP       -1.3494     2.252    -15.685    -0.050     0.039    [0.000]
eP,pM        0.8223     1.371      0.030     9.478     0.045    [0.000]
eP,pK        0.4674     0.781      0.017     5.585     0.035    [0.003]

eM,AWO      -0.4270     0.677     -5.254    -0.009     0.051    [0.010]
eM,AS        0.4055     0.540      0.038     3.496     0.025    [0.000]
eM,AWS      -0.8997     2.057    -15.932    -0.004     0.036    [0.000]
eM,AD       -0.1941     0.579     -5.226    -0.004     0.024    [0.003]
eM,Y         0.8468     0.151      0.575     2.780     0.024    [0.000]
eM,t         0.0586     0.088      0.003     0.598     0.001    [0.000]
eM,pN        0.0306     0.048      0.001     0.368     0.003    [0.015]
eM,pP        0.0724     0.111      0.003     0.677     0.004    [0.000]
eM,pM       -0.1043     0.160     -1.006    -0.005     0.006    [0.000]
eM,pK        0.0012     0.002      0.000     0.013     0.003    [0.934]

eK,AWO      -0.0166     0.275     -1.646     0.925     0.039    [0.477]
eK,AS        0.1910     0.217      0.017     1.108     0.026    [0.000]
eK,AWS      -0.3419     0.846     -4.641    -0.003     0.037    [0.020]
eK,AD       -0.2346     0.594     -4.127     0.006     0.026    [0.000]
eK,Y         0.1618     0.062      0.046     0.407     0.025    [0.000]
eK,t         0.0182     0.022      0.000     0.104     0.001    [0.000]
eK,pN        0.0453     0.061      0.002     0.276     0.020    [0.566]
eK,pP        0.3860     0.528      0.018     2.620     0.032    [0.003]
eK,pM        0.0112     0.015      0.001     0.066     0.033    [0.934]
eK,pK       -0.4425     0.604     -2.934    -0.021     0.034    [0.001]

levels for these elasticities, and especially for cross-eﬀects between external factors. In particular, the ZAWS elasticities are all insigniﬁcant except eZAWS,pM and eZAWS,pK. This implies that a rise in pM (and to a lesser extent pK) increases the value of proximity to agricultural production. Note also that the only spillover shadow value that does not increase (in absolute value) signiﬁcantly with pK is ZAWO, and all increase signiﬁcantly with pM. Additional insights may be gained from the Y and t elasticities in this Table. Higher levels of food processing production weakly imply greater value associated with closeness to other food processing activity, and to neighboring state’s agricultural activity, as exhibited by the eZAWO,Y and


Table 3. Shadow value elasticities

Measure         Estimate       St. dev.            Min            Max   St. error   P-value
eZAWO,Y           0.0917          0.077          0.002          0.318     0.071    [0.180]
eZAWO,t           0.0081          0.002          0.003          0.014     0.004    [0.062]
eZAWO,AWO        -0.2293          0.140         -0.644         -0.020     0.150    [0.183]
eZAWO,AS          0.0712          0.052          0.004          0.274     0.084    [0.407]
eZAWO,AWS        -0.0425          0.049         -0.398         -0.002     0.046    [0.350]
eZAWO,AD          0.1791          0.187          0.006          0.760     0.086    [0.005]
eZAWO,pN          0.0266          0.076         -0.138          0.210     0.033    [0.104]
eZAWO,pP          0.0935          0.062         -0.040          0.277     0.042    [0.007]
eZAWO,pM          0.8717          0.161          0.499          1.219     0.068    [0.000]
eZAWO,pK          0.0082          0.024         -0.047          0.064     0.023    [0.459]

eZAS,Y            0.0386          0.041         -0.251         -0.001     0.026    [0.157]
eZAS,t            0.0120          0.001          0.010          0.017     0.002    [0.000]
eZAS,AWO         -0.0519          0.026         -0.135         -0.004     0.059    [0.392]
eZAS,AS          -0.0151          0.013         -0.089          0.000     0.039    [0.711]
eZAS,AWS         -0.0176          0.028         -0.212          0.000     0.022    [0.458]
eZAS,AD     -7.65805D-09    8.17740D-10   -1.08155D-08   -6.20376D-09     0.023    [0.104]
eZAS,pN           0.0227          0.029         -0.082          0.105     0.013    [0.054]
eZAS,pP           0.0514          0.027         -0.057          0.109     0.013    [0.000]
eZAS,pM           0.8855          0.061          0.771          1.124     0.025    [0.000]
eZAS,pK           0.1486          0.020          0.054          0.181     0.011    [0.000]

eZAWS,Y           0.0699          0.063          0.002          0.330     0.042    [0.079]
eZAWS,t          -0.0022          0.000         -0.003         -0.002     0.002    [0.346]
eZAWS,AWO        -0.0315          0.015         -0.072         -0.003     0.033    [0.338]
eZAWS,AS          0.0160          0.012          0.001          0.078     0.022    [0.458]
eZAWS,AWS         0.0094          0.013          0.000          0.093     0.016    [0.563]
eZAWS,AD    -3.91817D-09    4.14711D-10   -5.50797D-09   -2.75844D-09     0.018    [0.303]
eZAWS,pN         -0.0023          0.022         -0.049          0.088     0.019    [0.988]
eZAWS,pP          0.0022          0.023         -0.054          0.089     0.020    [0.828]
eZAWS,pM          0.9613          0.049          0.771          1.071     0.038    [0.000]
eZAWS,pK          0.1423          0.016          0.108          0.209     0.015    [0.011]

eZAD,Y           -0.2330          2.003        -26.828         13.094     0.062    [0.000]
eZAD,t            0.0144          0.060         -0.258          1.043     0.003    [0.002]
eZAD,AWO          0.5607          2.824        -12.429         48.683     0.089    [0.000]
eZAD,AS           0.1005          0.529         -2.386          9.194     0.042    [0.117]
eZAD,AWS         -0.1017          0.692        -12.134          3.077     0.034    [0.325]
eZAD,AD          -0.9343          7.743       -135.472         35.724     0.034    [0.000]
eZAD,pN           0.0855          0.859        -14.523          4.120     0.040    [0.000]
eZAD,pP           0.1343          0.681        -11.426          3.124     0.041    [0.000]
eZAD,pM           0.6870          1.530         -6.268         26.610     0.088    [0.000]
eZAD,pK           0.0933          0.014          0.023          0.339     0.022    [0.000]

eZAWS,Y elasticities (but they are not statistically signiﬁcant). eZAS,Y may be similarly interpreted, although it is the opposite sign (since the sign of ZAS is reversed). By contrast, states with greater food processing intensity reap less beneﬁts from proximity to areas with higher demand concentrations. In the temporal dimension, cost savings beneﬁts from both demand-side agglomeration and own-industry thick market impacts have increased over the time frame of our analysis. The disadvantages of being in a rural area also are rising, and the cost-savings from proximity to suppliers falling. This suggests that over time the ‘‘draws’’ of urbanization economies and thick


market effects are motivating food processing industries to move away from rural or agricultural areas.

5. Concluding remarks

In this paper we have estimated and assessed spatial and industrial spillover effects in the U.S. food system. Our focus is on state-level food processing production, with thick market effects from neighboring states’ own-industry activity levels, and supply- and demand-agglomeration effects from proximity to high food demand concentrations and agricultural intensity, both within and across states.

We find statistically significant cost impacts of these spillover effects, although the supplier effect is a combination of benefits from having neighboring states with high agricultural levels, and costs of high agricultural intensity in the own state. This latter result might be interpreted as a “thin markets” effect arising from the disadvantages of being in too rural an area, such as low infrastructure levels (e.g., roads or telecommunications), and limited labor and capital pools or markets. Increasing returns to scale (or to being in a state with a higher level of food processing activity), and greater processing costs over time (possibly due to increasing levels of processing, quality, and differentiation of food products), are also evident. And apparent differences between total and marginal cost effects imply that there is a greater proportion of materials than other input costs at the margin, and that within-state supply and demand cost impacts have more of a fixed-effect than an incremental nature.

Measured second-order relationships underlying these cost effects indicate that the specification is generating (theoretically and conceptually) reasonable representations of production processes. The elasticities also reveal negligible cross-effects among the spillover factors, but clear differentiation among input responses to variations in spillovers, as well as different output levels, input prices, and time period.
Our results thus provide provocative evidence about the insights that may be gained from incorporating spatial and sectoral spillovers in production analyses; such factors seem to be key economic performance determinants, although they are not allowed for in conventional production models.

Appendix A: The data

Labor quantities: The number of workers engaged in production (PL) at operating manufacturing establishments, and the number of full-time and part-time employees (TOTAL) on the payrolls of these manufacturing establishments, are from the U.S. Census Bureau’s Annual Survey of Manufactures (ASM), Geographic Area Statistics. The total number of non-production workers (NL) is obtained as the difference between TOTAL and PL.

Wage bills: The ASM reports wages paid to production workers and gross earnings of all employees on the payroll of operating manufacturing establishments. The wage bill for NL is obtained by subtracting the wages paid to PL from the gross earnings of all employees. The nonproduction wage is obtained by dividing the nonproduction wage bill by NL. The production wage is obtained by dividing the production wage bill by PL.
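The labor quantity and wage construction described above amounts to two subtractions and two divisions per state-year record. A minimal Python sketch follows; the record layout, field names, and example figures are hypothetical and are not taken from the ASM files.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AsmLaborRecord:
    """One state-year record; field names are hypothetical."""
    total_employees: float       # TOTAL: full- and part-time employees
    production_workers: float    # PL
    gross_earnings: float        # gross earnings of all employees
    production_wage_bill: float  # wages paid to production workers


def derive_labor_series(rec: AsmLaborRecord) -> Dict[str, float]:
    """Derive NL, its wage bill, and both wage rates as described in the text."""
    nl = rec.total_employees - rec.production_workers
    nl_wage_bill = rec.gross_earnings - rec.production_wage_bill
    if nl <= 0 or rec.production_workers <= 0:
        raise ValueError("worker counts must be positive to form a wage rate")
    return {
        "NL": nl,
        "NL_wage_bill": nl_wage_bill,
        "nonproduction_wage": nl_wage_bill / nl,
        "production_wage": rec.production_wage_bill / rec.production_workers,
    }


# Made-up figures, for illustration only.
series = derive_labor_series(AsmLaborRecord(100.0, 60.0, 5000.0, 2400.0))
```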


Public capital stock: Following Eberts et al. (1986), the perpetual inventory technique was applied to state-level public infrastructure investment data to generate highway capital stock estimates. Discards were assumed to follow a truncated normal distribution, with the truncation occurring at one half the average life and one and one half times the average life. The Federal Highway Administration’s composite price index was used to deflate the capital and maintenance outlay series.

Private capital stock: The perpetual inventory method was applied to data on state-level new capital expenditures from the ASM, with the initial capital stock (1982) values taken from Morrison and Schwartz (1996). Depreciation rates for capital equipment are from the Bureau of Labor Statistics, Office of Productivity and Technology. The investment deflator was obtained from the Bureau of Labor Statistics and is their input price deflator for total manufacturing (SIC 20–39) capital services. The price of capital is obtained as $(i_t + d_t)\, q_{K,t}\, [1/(1 - \mathit{taxrate}_t)]$, where $d_t$ is the depreciation rate, $i_t$ is the Moody’s Baa corporate bond rate (obtained from the Economic Report of the President), $q_{K,t}$ is the investment deflator, and $\mathit{taxrate}_t$ is the corporate tax rate (obtained from the Office of Multifactor Productivity, Bureau of Labor Statistics).

Materials: The ASM reports direct charges actually paid or payable for items consumed or put into production during the year. The quantity of materials is obtained by deflating these charges by the ratio of nominal Gross Domestic Product to real Gross Domestic Product as reported on the Bureau of Economic Analysis website. This deflator is also used as the price of materials.

Output: The value of state-level shipments reported in the ASM was deflated by manufacturing Gross State Product deflators for each state (provided by Standard & Poor’s DRI).
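The private capital stock and price of capital calculations can be sketched as follows. Two assumptions are made that the text does not spell out: the perpetual inventory recursion is taken in its common geometric form, K_t = (1 - d) K_{t-1} + I_t, and the bracketed term in the price formula is read as 1/(1 - taxrate_t). Function names and input values are hypothetical.

```python
from typing import List, Sequence


def perpetual_inventory(
    initial_stock: float,
    investment: Sequence[float],
    depreciation_rate: float,
) -> List[float]:
    """Build a capital stock series with K_t = (1 - d) * K_{t-1} + I_t.

    Geometric depreciation with a constant rate is an assumption made
    for this sketch.
    """
    if not 0.0 <= depreciation_rate <= 1.0:
        raise ValueError("depreciation_rate must lie in [0, 1]")
    stocks: List[float] = []
    stock = initial_stock
    for inv in investment:
        stock = (1.0 - depreciation_rate) * stock + inv
        stocks.append(stock)
    return stocks


def price_of_capital(
    interest_rate: float,
    depreciation_rate: float,
    investment_deflator: float,
    tax_rate: float,
) -> float:
    """Return (i_t + d_t) * q_{K,t} / (1 - taxrate_t)."""
    if tax_rate >= 1.0:
        raise ValueError("tax_rate must be below 1")
    return (interest_rate + depreciation_rate) * investment_deflator / (1.0 - tax_rate)


# Made-up inputs, for illustration only.
stock_path = perpetual_inventory(100.0, [10.0, 10.0], 0.1)
user_cost = price_of_capital(0.08, 0.12, 1.0, 0.35)
```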
Spatial weights: Value of goods shipped data from state of origin to state of destination are from the 1992 Commodity Flows Survey, U.S. Bureau of Transportation Statistics.

Appendix B: Tables

Table B1. Summary statistics

Variable          Mean   St. deviation          Min           Max
TC             6233.09         6154.53       164.41      37095.11
Y              9176.72         9232.43       226.38      52671.01
N               306.38          320.59         2.35       1902.40
P               513.17          529.92        10.25       3037.49
M              5334.15         5245.19       120.79      29481.90
K              1855.53         1913.74        79.77      10154.70
pN              0.8742          0.1392       0.4240        1.7177
pP              0.8835          0.1038       0.6189        1.3730
pM              0.9382          0.0447       0.8501        1.0000
pK              0.2721          0.0073       0.2573        0.2825
AWO            9278.65         4362.12       766.16      21481.21
AD          4840202.93      7408846.59    100511.10   3.85144D+07
AS            70610.29        54263.11      2612.98     324824.44
AWS          106906.93       144625.77      3668.58    1054409.75

AD, AS, and AWS are normalized by land area, in terms of million square miles.


Table B2. Coefficient estimates (with t statistics)

Parameter     Estimate   t statistic
aN,L          1.39E+01       0.67
aN,M          7.89E+01       2.42
aN,K          2.32E+01       0.57
aP,M          1.85E+02       4.12
aP,K          1.95E+02       3.00
aK,M          5.56E+00       0.08
dN,Y          2.03E-02       6.25
dP,Y          3.28E-02      10.01
dM,Y          4.91E-01      34.81
dK,Y          3.55E-02       6.14
dN,t         -1.71E+00      -1.45
dP,t          6.19E+00       5.24
dM,t          7.34E+01       9.32
dK,t          8.38E+00       3.37
dN,WS         1.31E-04       0.48
dP,WS         7.72E-05       0.27
dM,WS        -1.04E-02      -5.69
dK,WS        -1.35E-03      -2.06
dN,WO         7.97E-04       0.14
dP,WO        -5.17E-03      -0.90
dM,WO        -7.03E-02      -2.36
dK,WO         7.08E-04       0.08
dN,S          8.94E-04       1.99
dP,S          1.41E-03       3.11
dM,S          1.54E-02       7.91
dK,S          2.84E-03       3.64
dN,D         -2.24E-05      -3.26
dP,D         -2.75E-05      -4.04
dM,D         -7.59E-05      -2.91
dK,D         -4.32E-05      -4.14
dY,Y         -1.49E-07      -3.34
dWS,WS       -1.51E-10      -0.59
dS,S         -5.44E-10      -0.37
dWO,WO        3.19E-07       1.44
dD,D          7.30E-13       6.66
dD,Y          1.10E-09       7.32
dD,WO        -1.48E-09      -6.37
dD,S         -4.07E-11      -1.64
dY,WO        -3.07E-07      -1.50
dY,S         -2.15E-08      -1.44
dY,D         -2.94E-08      -0.86
dD,WS         1.35E-11       1.03
dY,WS        -2.83E-08      -1.80
dWO,WS        1.18E-08       0.96
dS,WS        -8.08E-10      -0.75
dD,t         -4.06E-07      -4.79
dY,t          2.86E-05       0.54
dWO,t        -2.18E-04      -2.37
dWS,t         7.75E-06       0.95
dS,t          6.37E-05       6.03
q             0.369797       9.50
qN            0.39307        7.73
qP            0.280306       5.45
qM            0.375107       9.47
qK            6.66E-04       0.98

R2s
TC            0.9940
N             0.9939
P             0.9973
M             0.9916
K             0.9971

References

Anselin L (1988) Spatial econometrics: Methods and models. Kluwer Academic Publishers, Boston
Arnade C, Gopinath M (1998) Capital adjustment in U.S. agriculture and food processing: A cross sectoral model. Journal of Agricultural and Resource Economics 23(1):85–98 (July)
Atkinson SE, Halvorsen R (1990) Tests of allocative efficiency in regulated multi-product firms. Resources and Energy 12(1):65–77 (April)
Bartelsman E, Caballero RJ, Lyons RK (1994) Customer- and supplier-driven externalities. American Economic Review 84(4):1075–1084
Baumol WJ, Panzar JC, Willig RD (1982) Contestable markets and the theory of industry structure. Harcourt Brace Jovanovich, New York
Bell KP, Bockstael NE (2000) Applying the generalized-moments estimation approach to spatial problems involving microlevel data. The Review of Economics and Statistics 82(1):72–82 (February)
Berndt ER (1991) The practice of econometrics. Addison-Wesley Publishing Company, Boston, MA
Case AC, Rosen HS, Hines JR Jr (1993) Budget spillovers and fiscal policy interdependence. Journal of Public Economics 52:285–307
Ciccone A, Hall RE (1996) Productivity and the density of economic activity. American Economic Review 86(1):54–70 (March)
Coe DT, Helpman E (1995) International R&D spillovers. European Economic Review 39:859–887


David P, Rosenbloom J (1990) Marshallian factor markets, externalities, and the dynamic of industrial localization. Journal of Urban Economics 28:349–370
Feldman MP (1999) The new economics of innovation, spillovers and agglomeration: A review of empirical studies. Economics of Innovation and New Technology 8(1–2):5–25
Goetz SJ (1997) State- and county-level determinants of food manufacturing establishment growth: 1987–1993. American Journal of Agricultural Economics 79(3):838–850 (August)
Goodwin BK, Brester GW (1995) Structural change in factor demand relationships in the U.S. food and kindred products industry. American Journal of Agricultural Economics 77(1):69–79 (February)
Gopinath M, Vasavada U (1999) Patents, R&D, and market structure in the U.S. food processing industry. Journal of Agricultural and Resource Economics 24(1):127–39 (July)
Hall RE (1989) Temporal agglomeration. National Bureau of Economic Research, Working paper #3143. October
Hall RE (1990) Invariance properties of Solow’s productivity residual. In: Diamond P (ed) Growth/productivity/employment: Essays to celebrate Bob Solow’s birthday. MIT Press, Cambridge, MA
Hoover EM (1948) The location of economic activity. McGraw-Hill, New York
Huang KS (1991) Factor demands in the U.S. food-manufacturing industry. American Journal of Agricultural Economics 73(3):615–620 (August)
Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review 40(2):509–533 (May)
Krugman P (1991) Increasing returns and economic geography. Journal of Political Economy 99:483–499
Kumbhakar SC (2001) Estimation of profit functions when profit is not maximum. American Journal of Agricultural Economics 83(1):1–19 (February)
Morrison CJ (1985) Primal and dual capacity utilization: An application to productivity measurement in the U.S. automobile industry. Journal of Business and Economic Statistics 3(4):312–324 (October)
Morrison CJ (1997) Structural change, capital investment and productivity in the food processing industry. American Journal of Agricultural Economics 79(1):110–125 (February)
O’Sullivan A (2000) Urban economics, 4th ed. McGraw-Hill Publishers, New York
Paul CJM, Siegel D (1998) Knowledge capital and cost structure in the US food and fiber industries. American Journal of Agricultural Economics 80(1):30–45 (February)
Paul CJM, Siegel D (1999) Scale economies and industry agglomeration externalities: A dynamic cost function approach. American Economic Review 89(1):272–290 (March)
Sickles RC, Streitwieser ML (1998) An analysis of technology, productivity, and regulatory distortion in the interstate natural gas transmission industry: 1977–1985. Journal of Applied Econometrics 13(4):377–395 (July/Aug)
Zucker LG, Darby MR (1998) Capturing technological opportunity via Japan’s star scientists: Evidence from Japanese firms’ biotech patents and products. Journal of Technology Transfer 26(1–2):37–58 (January)