Jowell-Prelims.qxd
3/9/2007
5:04 PM
Page i
Measuring Attitudes Cross-Nationally
The contributors

Jaak Billiet – Professor of Social Methodology at the Katholieke Universiteit Leuven, Centre for Sociological Research
James Davis – Senior Lecturer in Sociology at the University of Chicago and Senior Research Scientist at the National Opinion Research Center
Gillian Eva – Research Fellow at City University and a member of the ESS Central Coordinating Team (CCT)
Rory Fitzgerald – Senior Research Fellow at City University and a member of the ESS CCT
Irmtraud Gallhofer – Member of the ESS CCT and senior researcher at ESADE Business School, Universitat Ramon Llull, Barcelona
Sabine Häder – Senior Statistician at Zentrum für Umfragen, Methoden und Analysen (ZUMA), Mannheim, Germany and a member of the European Social Survey sampling panel
Janet A. Harkness – Senior Research Scientist at ZUMA, Mannheim, Germany and Director of the Survey Research and Methodology Program at the University of Nebraska, USA
Bjørn Henrichsen – Director at Norwegian Social Science Data Services
Roger Jowell – Research Professor at City University London and Principal Investigator of the European Social Survey (ESS)
Max Kaase – Emeritus Professor of Political Science at the University of Mannheim, past President of the International Political Science Association, and chair of the ESS Scientific Advisory Board
Achim Koch – Senior Researcher at the European Centre for Comparative Surveys (ECCS) at ZUMA, Mannheim, Germany
Kirstine Kolsrud – Senior Adviser at Norwegian Social Science Data Services
Peter Lynn – Professor of Survey Methodology at the University of Essex, UK and a member of the European Social Survey sampling panel
Peter Mohler – Director of Zentrum für Umfragen, Methoden und Analysen (ZUMA), Mannheim, Germany and Professor at Mannheim University
José Ramón Montero – Professor of Political Science at the Universidad Autónoma de Madrid and the Instituto Juan March, Madrid
Kenneth Newton – Professor of Comparative Politics at the University of Southampton and Visiting Fellow at the Wissenschaftszentrum Berlin
Pippa Norris – Director of the Democratic Governance Group, United Nations Development Program and the McGuire Lecturer in Comparative Politics, Harvard University
Michel Philippens – Former research assistant at the Katholieke Universiteit Leuven, Centre for Sociological Research
Willem E. Saris – Member of the ESS CCT and Professor at the ESADE Business School, Universitat Ramon Llull, Barcelona
Shalom H. Schwartz – Sznajderman Professor Emeritus of Psychology at the Hebrew University of Jerusalem, Israel
Knut Kalgraff Skjåk – Head of Department at Norwegian Social Science Data Services
Ineke Stoop – Head of the Department of Data Services and IT at the Social and Cultural Planning Office of the Netherlands
Measuring Attitudes Cross-Nationally: Lessons from the European Social Survey
EDITORS
Roger Jowell, Caroline Roberts, Rory Fitzgerald and Gillian Eva
Centre for Comparative Social Surveys at City University, London
© Centre for Comparative Social Surveys, City University, London 2007

First published 2007

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

SAGE Publications Ltd
1 Oliver’s Yard
55 City Road
London EC1Y 1SP

SAGE Publications Inc.
2455 Teller Road
Thousand Oaks, California 91320

SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road, Post Bag 7
New Delhi 110 044

SAGE Publications Asia-Pacific Pte Ltd
33 Pekin Street #02-01
Far East Square
Singapore 048763

Library of Congress Control Number: 2006932622

British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library

ISBN 978-1-4129-1981-4

Typeset by C&M Digitals (P) Ltd, Chennai, India
Printed in Great Britain by Athenaeum Press, Gateshead
Printed on paper from sustainable resources
Contents

Foreword

1  The European Social Survey as a measurement model
   Roger Jowell, Max Kaase, Rory Fitzgerald and Gillian Eva
     Introduction
     In defence of rigour
     The pursuit of equivalence
     The ESS model in practice
     Continuity
     Governance
     Division of tasks
     Workpackages 1 and 2: Overall project design and coordination
     Workpackage 3: Sampling
     Workpackage 4: Translation
     Workpackage 5: Commissioning fieldwork
     Workpackage 6: Contract adherence
     Workpackage 7: Piloting and data quality
     Workpackages 8 and 9: Question reliability and validity
     Workpackage 10: Event monitoring
     Workpackage 11: Data access and aids to analysis
     Conclusion
     Notes
     References

2  How representative can a multi-nation survey be?
   Sabine Häder and Peter Lynn
     Introduction
     Equivalence of samples
     Sample sizes
     Achieving equivalence
     Population coverage
     Sampling frames
     Sample designs
     Design weights
     Design effects
     Sample size
     Organisation of the work
     Conclusion
     References
3  Can questions travel successfully?
   Willem E. Saris and Irmtraud Gallhofer
     Introduction
     Seven stages of questionnaire design
     Background to the evaluation of questions
     Evaluation of ‘concepts-by-intuition’
     Quality criteria for single survey items
     The Multitrait-Multimethod design
     Predicting the quality of questions
     Evaluation of ‘concepts-by-postulation’
     Political efficacy
     The Human Values Scale
     An evaluation of cross-cultural comparability
     Conclusion
     References
     Appendix

4  Improving the comparability of translations
   Janet A. Harkness
     Introduction
     Source and target languages
     Organisation and specification
     Organisation
     Specification
     The Translation Procedure: TRAPD
     Split and parallel translations
     Countries with more than one language
     Producing multiple translations
     Sharing languages and harmonisation
     Ancillary measures to support translation
     Annotating the source questionnaire
     Query hotline and FAQs
     Documentation templates
     Lessons learned
     Source questionnaire and translation
     Advance translation
     Templates and production tools
     Attention to detail
     Identifying translation errors
     Conclusion
     References
5  If it bleeds, it leads: the impact of media-reported events
   Ineke Stoop
     Introduction
     “Events, dear boy, events”
     Events in the media
     News flow and event identification
     Guidelines and database
     Meanwhile, what was happening in Europe?
     Looking ahead
     Notes
     References

6  Understanding and improving response rates
   Jaak Billiet, Achim Koch and Michel Philippens
     Introduction
     Response quality: standards and documentation
     The conduct of fieldwork
     Response and non-response
     Why such large country differences in response rates?
     Country differences in non-contact rate reduction
     Contact procedures
     Number of contact attempts
     Contactability
     Country differences in refusal conversion
     Differentiation of respondents according to readiness to co-operate
     Estimation of non-response bias
     Conclusion
     References
     Appendix

7  Free and immediate access to data
   Kirstine Kolsrud, Knut Kalgraff Skjåk and Bjørn Henrichsen
     Introduction
     Data access barriers
     Standardising the production of data and meta data
     The data
     The survey documentation
     Dissemination
     Conclusion
     References
8  What is being learned from the ESS?
   Peter Mohler
     Introduction
     Consistency
     Transparency
     Coordination and management
     Innovative probability samples
     A source of data on error and bias
     Translation
     Free and easy access to data
     Capacity building
     Conclusion
     References

9  Value orientations: measurement, antecedents and consequences across nations
   Shalom H. Schwartz
     Introduction
     The nature of values
     Current survey practice and the conception of values
     A theory of the content and structure of basic human values
     Ten basic types of value
     The structure of value relations
     Comprehensiveness of the ten basic values
     But are self-reports valid indicators of values?
     Measuring values in the ESS
     Development of the Human Values Scale
     Methodological issues in designing the scale
     Correcting for response tendencies
     Reliability of the ten values
     Value structures in the ESS countries
     Value priorities in the ESS countries
     Sources of individual differences in basic values
     Age and the life course
     Gender
     Education
     Income
     Basic values as a predictor of national and individual variation in attitudes and behaviour
     Attitudes to immigration
     Interpersonal trust
     Social involvement
     Organisational membership
     Political activism
     Conclusion
     References
     Appendix 1
     Appendix 2
     Appendix 3
10  Patterns of political and social participation in Europe
    Kenneth Newton and José Ramón Montero
     Introduction
     Individual participation: fragmented and multi-dimensional
     National levels of participation: also fragmented and multi-dimensional?
     Types of participation
     Participation in voluntary associations
     Social and helping behaviour
     Conventional political participation
     Protest politics
     Overall participation
     What explains the national patterns?
     Conclusion
     References
     Appendix 1
     Appendix 2
     Appendix 3
     Appendix 4
     Appendix 5
     Appendix 6
     Appendix 7

11  A continental divide? Social capital in the US and Europe
    Pippa Norris and James Davis
     Introduction
     Tocquevillian theories of social capital
     Social networks and social trust matter for societal co-operation
     Social capital has important consequences for democracy
     Social capital has declined in post-war America
     Social capital in advanced industrialised societies
     Evidence and measures
     Comparing social capital in Europe
     Cohort analysis of social capital
     Conclusions
     References
     Appendix

Index
Foreword

This book describes the product of a remarkable collaboration across national borders between researchers and funders whose singular purpose has been to build a regular and rigorous means of charting attitudinal and behavioural change in a changing Europe. The project’s starting point (and its continual preoccupation) has been to find ways of tackling the longstanding and seemingly intractable difficulties of achieving equivalence in comparative social surveys. This volume is about the problems facing comparative social research generally and new approaches to finding solutions. Almost all chapters have been written by one or more of the primary architects and initiators of the European Social Survey (ESS). Each chapter deals with a particular aspect of comparative social surveys – from sampling to translation, response rate enhancement to harmonisation of data, and so on – tracing the difficulties and describing how the ESS attempts to solve them. Chapter 1 records the origins of the European Social Survey, its underlying philosophy and purpose. It also introduces and summarises its many innovations – both methodological and organisational. Chapter 2 discusses the obstacles to achieving equivalent random samples within different countries. It documents the ESS’s unprecedented approach to achieving a viable solution. Chapter 3 describes the unusual collection of hoops through which ESS questions have to pass before they are adopted as part of the questionnaire, warning of the hazards of less rigorous approaches. Chapter 4 documents the unusual procedures and protocols employed in the ESS to obtain equivalent translations from the source questionnaire into well over 20 languages, contrasting ESS methods with alternative approaches.
Chapter 5 reviews the possible impact of major national or international events on attitudinal trend data and describes the methods the ESS has developed to monitor and record such events with the purpose of informing subsequent data analyses. Chapter 6 is about patterns of declining response rates in surveys, and the particular problem of differential response rates in cross-national surveys. It describes the range of counteractive measures taken in the ESS and assesses their effectiveness.
Chapter 7 tackles the formidable difficulty of producing an equivalent, user-friendly and timely dataset in the same form from over 20 separate countries. It outlines the meticulous procedures and protocols employed by the ESS to achieve this. Chapter 8 assesses what lessons we are learning from the various ESS innovations in methodology and organisational structure, acknowledging what has already been learned from predecessor cross-national social surveys. Chapter 9 outlines the origins and development of the ‘human values scale’ employed in the ESS and demonstrates its utility for mapping the structure of values across nations. Chapter 10 analyses the results of the rotating module in Round 1 of the ESS on citizen involvement and democracy, showing distinctly different national patterns of participation in both voluntary and political activity. Chapter 11 compares ESS data with data from the US General Social Survey to investigate to what extent the well-documented ‘crisis’ of declining social capital in the US applies to European nations too. The huge debts we owe to colleagues throughout Europe are too numerous to itemise here. The organisational structure of the ESS means that in each of 32 countries there are numerous individuals and organisations that have taken on the task of making the ESS a success in their own country. They include, above all, the National Coordinators who orchestrate the work in their country and who generously contribute their ideas and expertise, the survey agencies that carry out the fieldwork and data preparation to remarkably high standards, and, of course, the national funding agencies that have consistently financed successive rounds of fieldwork and coordination in their country. 
In addition, members of our various advisory boards and committees – the Scientific Advisory Board, the Funders’ Forum, the Methods Group, the Sampling Panel and the Translation Panel – have played an invaluable role in helping to secure and sustain the quality of the project. We greatly appreciate their respective contributions and realise how much we depend on them – individually and collectively – to help us manage such a large and complex multinational enterprise. As for the production of the book itself, we have relied heavily on the talents and meticulousness of Sally Widdop, a research assistant at our Centre, who has kept us on track and told us precisely what to do – for all of which we owe her a heartfelt vote of thanks.

The editors
Roger Jowell
Caroline Roberts
Rory Fitzgerald
Gillian Eva
1  The European Social Survey as a measurement model

Roger Jowell, Max Kaase, Rory Fitzgerald and Gillian Eva∗
Introduction

The importance to social science of rigorous comparative research is incontestable. It helps to reveal not only intriguing differences between countries and cultures, but also aspects of one’s own country and culture that would be difficult or impossible to detect from domestic data alone. As Durkheim famously put it: “Comparative sociology is not a particular branch of sociology: it is sociology itself” (Durkheim, 1964, p. 139). Even so, the strict methodological standards that have long been employed in many national studies have tended to be beyond the reach of many comparative studies (Scheuch, 1966; Teune, 1992). One obvious reason is their expense. But there are other even more compelling reasons, notably that comparative studies have to deal with competing cultural norms and national methodological preferences that single-nation studies do not begin to face. Although these problems are not necessarily insuperable, it seems that national customs and conventions have too often held sway over methodological consistency. As a result, design inconsistencies that would never be tolerated in important national studies have frequently been shrugged off in important comparative studies. Only after the event have the
∗ Roger Jowell is a Research Professor at City University London and Principal Investigator of the European Social Survey (ESS); Max Kaase is an emeritus Professor of Political Science at the University of Mannheim, past President of the International Political Science Association, and chair of the ESS Scientific Advisory Board; Rory Fitzgerald is a Senior Research Fellow at City University and a member of the ESS Central Coordinating Team (CCT); Gillian Eva is a Research Fellow at City University and a member of the ESS CCT.
methods of several celebrated comparative studies been shown to be less consistent between nations than they ought to be (see Verba, 1971; Saris and Kaase, 1997; Park and Jowell, 1997). This was the situation that confronted the team responsible for the ‘Beliefs in Government’ project, which started in 1989, sponsored by the European Science Foundation (ESF) and led by Max Kaase and Ken Newton (1995). The project was designed to compile and interpret existing data about changes over time in the socio-political orientations of European citizens in different countries. Many sources of data were available to the study – notably time series such as the Eurobarometers, the International Social Survey Programme, the European (and World) Value Surveys, and sets of national election studies. But although these studies formed the essential source material for the study, the scope for rigorous comparative analysis across countries and over time was limited by their discontinuities and internal inconsistencies. This discovery was the inspiration behind the European Social Survey. A member of the ESF Standing Committee of the Social Sciences (SCSS) at the time, Max Kaase proposed to his colleagues a project to investigate the feasibility of starting a new European Social Survey with a view to mitigating the limitations that the Beliefs in Government project had revealed. The SCSS agreed and set up an eight-person ‘Expert Group’ to pursue the idea (see Note 1 at the end of this chapter). At the end of its year-long deliberations, it concluded that a new, rigorous and meticulously planned pan-European general social survey was both desirable and feasible (ESF, 1996). As importantly, it concluded that, with the aid of the ESF and its member organisations throughout Europe (plus, it was hoped, the European Commission – EC), the project was likely to be fundable.
Thus encouraged, the SCSS set up and financed two new committees: the first – a Steering Group (see Note 2 at the end of this chapter) – representing social scientists selected by each of the ESF’s interested member organisations; and the second – a Methodological Committee (see Note 3 at the end of this chapter) – consisting of a smaller number of specialists from a range of European countries. These two groups were jointly charged with turning the idea into a well-honed blueprint for potential action. After parallel deliberations, though with some overlaps in membership, the chairs of the two committees (Kaase and Jowell), together with the SCSS scientific secretary (John Smith), jointly produced a Blueprint document (ESF, 1999), which was duly presented to and endorsed by the SCSS and distributed to all ESF member organisations. Here at last was a document which contained not only a call for regular, rigorous monitoring of changes in values within modern Europe, but also a detailed specification of how such a highly ambitious project might be set up and implemented in an equivalent way across a diverse range of European countries. The Blueprint also made
clear that the project could not be a one-shot comparative survey. To achieve its essential aim of monitoring and interpreting change, it had to undertake repeat measurements over an extended period. The Blueprint was soon welcomed by many academics in the field throughout and beyond Europe, but also – and more importantly perhaps – by the many national social science funding agencies that, as ESF members, might be called on to contribute resources to such a project. The proposal had its detractors too, most of whom saw the potential value of the project but believed it might be too ambitious and expensive to get off the ground. As the remainder of this book shows, these fears fortunately proved to be unfounded. Following publication of the Blueprint, a small team led by Roger Jowell was assembled (see Note 4 at the end of this chapter) to formulate an application to the EC for core funding of the project that would cover the ESS’s detailed design and continuing coordination, but not its fieldwork – which was always to be financed at a national level. Meanwhile, the ESF had begun seeking commitments from its member organisations that – if EC funding was in the event to materialise for the ESS core activities – they would in turn be ready to meet the costs of their own national fieldwork and domestic coordination. Learning from the experience of other studies, however, no potential funding agency was left in any doubt that the hallmark of the ESS was to be consistency across nations and exacting standards. Thus, familiar but inappropriate national variations in methodology were in this case to be firmly resisted. Rather, the design was to be based on the now publicly available Blueprint and determined by a Central Coordinating Team. Although there would, of course, be consultation with all participants and advisers, the ESS was above all to be implemented according to a uniform (or equivalent) set of principles and procedures.
Given that many of the potential participating countries would have to go through complicated funding hoops to secure support for this new venture, the core application to the Commission cautiously assumed that around nine nations would participate in the first round. Others, it was hoped, would follow suit in subsequent rounds. As it turned out, however, not long after the successful outcome of the EC application had been announced, an astonishing 22 countries had opted to join the ESS’s first biennial round in 2002/2003, each funding its own share of the study’s costs. All but one of those same nations then also took part – again on a self-funding basis – in the second round in 2004/2005, and were joined by five new nations. Now almost all of these nations are participating in the third round in 2006/2007, again with some important new entrants. Critically, at each new round the EC has also supported applications from the central coordinating team to cover the project’s continuing design and coordination.
Apart from its unusual rigour for a comparative attitudinal survey, two further features of the ESS attracted immediate and widespread interest among social scientists. The first was the division of the ESS questionnaire into two halves – one half devoted to its core measures and the other half to two rotating modules, both subject to a Europe-wide competition among multinational teams of social scientists. This arrangement ensures on the one hand that there is appropriate continuity between rounds, but on the other that the central team is not the sole arbiter of the study’s content. It also means that many academics in many countries look to the ESS as a potential vehicle for the collection of valuable multinational data in their field. The second feature of the ESS that has ensured immediate attention is its firm policy of transparency and open access. All its protocols and methods are made immediately available on the ESS website (www.europeansocialsurvey.org), and each round of data is also made immediately available on the ESS data website (http://ess.nsd.uib.no), giving everyone simultaneous access and allowing no privileged prior access to the principal investigators. Perhaps it was these features of the ESS that so swiftly alerted social scientists to its existence, particularly those throughout the world who are involved in comparative social measurement. But the interest in the project seemed to expand exponentially when it was announced in 2005 that the ESS team had won the coveted Descartes Prize “for excellence in collaborative scientific research”. As the first social science project ever even to have been short-listed for this top European science prize, it was a welcome sign that the project had met the approval of the wider scientific community in Europe. Before dealing with the specific components of the ESS model, we wish briefly to rehearse some of the broader motivations behind the enterprise. 
In defence of rigour

Good science – whether natural science or social science – should never turn a blind eye to its known imperfections. Nor should those imperfections be concealed from potential users. Some might argue that the social sciences are always an order of magnitude more error-prone than are the natural sciences. That is disputable, but in any case it provides all the more reason for greater rather than less vigilance in social science methodology. In some respects too, the social sciences are even more complicated than the natural sciences. Although they do not have to explain the complexities of the physical and natural world, they do have to interpret and explain the complexities of people’s interactions – whether with one another or with their world. And human interactions are in some ways more complicated than are interactions in the physical and natural world. For one thing, ‘laws of behaviour’ are less in evidence among human populations than among,
say, physical objects, or chemicals, or even creatures. Thus, social scientists cannot as confidently make assumptions about the likely regularities of human interactions as, say, chemists sometimes can about the interactions between certain gases. Not only do cultural variations complicate the measurement of human behaviour and attitudes across nations, but so perhaps do larger and more unpredictable individual variations within the same populations. Moreover, human beings have their own value systems and are ‘opinionated’ in ways that their counterparts in the natural world are not. They are also all too capable of believing one thing and doing (or saying) quite another. So, the social sciences often have to start off by overcoming barriers which are erected (whether deliberately or intuitively) by the objects of their measurements themselves. Unless they succeed, these barriers may distort or nullify their findings. All of which makes the general domain of the social scientist particularly tricky. But, as in all fields, some aspects are a great deal trickier than others. Three features of the ESS (and other similar studies) place it near the extreme of this notional spectrum of difficulty:

• Measuring social attitudes and values is for many reasons more risky and error-prone than measuring validatable facts and behaviour patterns, because attitudes and values tend to be even more fluid and context-dependent.

• Measuring change over time adds a level of complexity to the analysis and interpretation of findings that rarely applies to studies that are able to rely on one-off measurements.

• Measuring cross-national differences and similarities is made infinitely more difficult by simultaneous variations in social structure, legal systems, language, politics, economics and culture that would be rare indeed in a single-country study.

Cross-national studies of attitude change simultaneously incorporate all three of these daunting aspects of quantitative social measurement.
But the ESS was fortunate in coming late to the scene, by which time many distinguished comparative studies had already laid the groundwork, such as Almond and Verba (1963), Barnes et al. (1979) and, more recently, a series of comparative surveys of attitude and value change, including the Eurobarometers, the International Social Survey Programme and the European (and World) Value Surveys. The ESS was determined not only to learn from these studies, but also, wherever possible, to mitigate the methodological difficulties they had encountered, just as other present and future projects will doubtless build on the ESS model.
The initiators of the ESS also found themselves with an enviable remit. Their role was not just to determine the structure and style of a new improved time series on European attitude change, but to do so without compromising the highest standards of scientific rigour. The enthusiastic and widespread support they received for this goal was as surprising as it was inspiring. It came not just from individual members of numerous specialist advisory groups, but also from the officials (and ultimately the referees) who deal with EC Framework Programmes, as well as from a range of funding councils throughout Europe (well beyond the borders of the EU itself). The time was clearly ripe for a brave new initiative which would not only monitor value change in a changing Europe according to the highest technical standards, but also meticulously (and openly) document the process for the benefit of others in the field. At last rigour, as opposed to speed and cost alone, was firmly back on the agenda.

The pursuit of equivalence

All quantitative research depends for its reliability on what may be called a “principle of equivalence” (Jowell, 1998). For instance, even in national surveys the probability of an individual citizen’s selection in a sample should be equal (or at least known and non-zero) to satisfy the demands of representativeness. Similarly, co-operation or response rates should not vary greatly between different subgroups within a nation if the pursuit of equal representation is to be sustained. Questions should have a broadly equivalent meaning to all respondents to ensure that variations in the data derive from differences in their answers rather than in their interpretation of the questions. Coding schemas must be devised to ensure that it is the codes rather than the coders that account for differences in the distribution of answers. And so on.
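The requirement that selection probabilities be equal, or at least known and non-zero, is what allows unequal-probability samples to be corrected by weighting. The sketch below is illustrative only – the numbers and the `design_weight` function are hypothetical, not drawn from any actual ESS sample design – but it shows the standard inverse-probability logic: a respondent who had a smaller chance of selection counts for proportionately more in estimates.

```python
def design_weight(selection_prob: float) -> float:
    """Weight a respondent by the inverse of their selection probability."""
    if selection_prob <= 0:
        raise ValueError("selection probability must be known and non-zero")
    return 1.0 / selection_prob

# In an address-based sample, a person living alone is interviewed whenever
# their address is drawn; someone in a three-person household had only a
# one-in-three chance of being the person chosen at that address.
w_single = design_weight(1.0)
w_shared = design_weight(1.0 / 3)

# Weighting restores equal representation in estimates.
# Each pair is (answer coded 0/1, design weight):
answers = [(1, w_single), (0, w_shared), (1, w_shared)]
weighted_mean = (sum(v * w for v, w in answers)
                 / sum(w for _, w in answers))
print(round(weighted_mean, 3))  # 0.571, rather than the unweighted 0.667
```

The same principle underlies the design weights and design effects discussed in Chapter 2: once every unit's inclusion probability is known and non-zero, differences between national sample designs can be reconciled analytically rather than left to distort comparisons.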
A great deal of work in national surveys therefore goes into the sheer process of ensuring that different voices in the population are appropriately represented and taken equally into consideration. Only to the extent that a national survey succeeds in that objective are its findings likely to approximate to some sort of social reality. But to the extent that these problems of achieving equivalence affect national surveys – since no nation is homogeneous with respect to vocabulary, first language, modes of expression, levels of education, and so on – they are, of course, greatly magnified when it comes to multi-national surveys. For a range of well-documented reasons, most comparative surveys have not entirely succeeded in coming to grips with them. Cultural, technical, organisational and financial barriers have undermined equivalence in comparative studies for at least three decades – from the ‘courtesy bias’ first discovered in South East Asian studies (Jones, 1963), to the recognition that ‘spurious lexical equivalence’ often disguises major differences in meaning (Deutscher, 1968; Rokkan, 1968;
Cseh-Szombathy, 1985). Hantrais and Ager (1985) have argued for more effective cooperation between linguists and social scientists, but – to the extent that this has happened at all – it has not improved things markedly. The fact remains that different languages are not necessarily equivalent means of defining and communicating the same ideas and concepts; they are also reflections of different thought processes, institutional frameworks and underlying values (Lisle, 1985; Harding, 1996; Harkness, 2003). From the start, comparative researchers were also frustrated by country-specific differences in methodological and procedural habits – such as in their preferred modes of interviewing, their deeply ingrained preferences for different sampling models and procedures, major differences in how they defined ‘acceptable’ response rates, the different ways in which they employed visual aids, variations in their training of interviewers and coders, and their often tailor-made socio-demographic classifications (see, for instance, Mitchell, 1965). Comparative social scientists also soon discovered that certain ‘standard’ conceptualisations of cleavages within one country (such as the left–right continuum, or the liberal–conservative one) had no direct counterpart in another, and that seemingly identical questions about concepts such as strong leadership or strong government, or nationalism or religiosity, tended to be interpreted quite differently in different countries according to their different cultural, social structural and political conditions (Miller et al., 1981; Scherpenzeel and Saris, 1997; Saris and Kaase, 1997). Many impressive attempts have been made to mitigate these problems, but with patchy results. For instance, having heeded the problems faced by predecessors’ comparative studies, the International Social Survey Programme (ISSP) started off with strict standardisation in mind (Davis and Jowell, 1989).
Although the ISSP did in fact make large strides towards consistency, it was thwarted by an absence of any available central coordinating budget with which to help enhance its equivalence across nations. Each of the (now) 39 national institutions in the ISSP has to find its own annual funds to carry out the survey and although they all ‘agree’ to follow the project’s clearly laid-out methods and procedures, some of them have found themselves unable to comply without stretching the meaning of concepts such as ‘probability sampling’ or ‘no substitution of refusals’. Moreover, unlike the ESS which has the resources to identify such problems in advance and to monitor the implementation of agreed standards, embarrassing variations in the ISSP were discovered only after the event. And despite the heroic efforts by the ISSP secretariat to remedy these problems in subsequent rounds of the survey, some have proved difficult to shift. These experiences confirmed to the architects of the ESS that, in the absence of appropriate budgetary or executive sway, too many participants in multi-national surveys will inevitably take decisions into their own hands with potentially serious consequences for equivalence and reliability.
One key aspect of the ESF Blueprint was to prove critical in mitigating this problem. A two-pronged approach was devised to help ensure compliance with the ESS’s centrally determined specification. In the first place, the ever-present Central Coordinating Team is responsible for designing, specifying and monitoring the use of equivalent methods in all nations. Equally, all national funding organisations make their own separate commitments (via the ESF) that they too will ensure compliance on behalf of their selected national teams. It is probably this dual arrangement, above all, that sustains the extent of methodological equivalence which has come to define the ESS. Inevitably, however, plenty of national deviations still manage to arise. True, most but not all are minor, and most but not all are inadvertent. But in keeping with the project’s spirit of transparency, all such deviations are identified and published at the conclusion of each round of the survey. This practice is by no means designed to ‘name and shame’ those responsible for the deviations. It has two quite different motives. First, it shows all participants what can go wrong, with a view to preventing similar breaches in future rounds; and secondly, potential users of the data have a right to have early knowledge of such deviations in case they affect their analyses, or even their choice of which nations to include in their comparisons. There is, of course, an almost endless list of potential hazards that can crop up in one corner or another of a large cross-national study – from subtle translation discrepancies to uncharted sampling differences, from esoteric variations in coding conventions to differential context effects, from major response rate variations to more straightforward transcription errors, from variations in ‘standard’ definitions to mundane timetable slippages, and so on. All these hazards can be reduced to a greater or lesser extent, but they cannot, of course, ever be eliminated.
All the ESS protocols, which are published on its website, go into meticulous detail to help ensure that these risks are minimised. Practical steps are also taken, such as setting up a standing sampling panel, a methods group and a translation panel to give detailed help on a range of technical issues. As with all multi-national studies, one of the most difficult tasks is to achieve functionally equivalent translations of questionnaires and other documents. In the case of the ESS, the Blueprint argued for English as the project’s official language – for its meetings as well as all its central documentation. This proposal prevailed. Thus, all original ESS protocols, questionnaires and field materials are formulated in English and subsequently translated by national teams as necessary into their own languages (well over 20 in all) – see chapter 4. Although this practice has a strong whiff of hegemony about it, it is nonetheless a massive administrative convenience for a unified project such as the ESS. But it also has its hazards because certain English phrases (and especially idioms) have no equivalent counterpart in many other languages. On balance, however, operating in a single widely spoken language is surely preferable to the potentially chaotic alternative. And we are fortunate in having
the help of a group of admirably bilingual National Coordinators and their colleagues to prevent the most obvious errors. We stress these issues to illustrate the numerous inherent obstacles to equivalence that a multi-national survey covering such a large number of heterogeneous countries inevitably faces. Issues of taxonomy, technique, human error, lapses in communication, cultural and political circumstances, and a host of other factors all get in the way of equivalence to a greater or lesser extent. And these difficulties increase with the number and heterogeneity of the countries involved. Nonetheless, we should not exaggerate the rigidity with which the ESS pursues absolute methodological consistency come what may. Its goal is to achieve equivalent methods and measures, not identical ones. It would, for instance, be wholly unrealistic to require all countries to use precisely the same sampling procedures. Some countries – notably the Nordic countries – have publicly available registers of all individuals which contain details of their demographic and economic characteristics of a sort that would infringe the privacy laws of other countries. Alas, most countries do not constitute such a ‘sampling heaven’, and some have no reliable publicly available list of individuals or addresses at all. To select equivalent probability samples in these very different circumstances necessitates different approaches to the same end. So although the ESS specifications do rigidly require each national sample to be based on random (probability) methods designed to give every resident of that country (not just citizens) an equal (non-zero) chance of selection, each country has to achieve that overall objective taking due account of its particular set of opportunities and obstacles. 
This process of working closely with the central sampling panel may involve quite a bit of to-ing and fro-ing before an optimal solution is reached, but in no case has the goal of sampling equivalence been breached (see chapter 2).

The ESS model in practice

The ESS’s three main aims are:

• to produce rigorous data about trends over time in people’s underlying values within and between European nations
• to rectify longstanding deficits in the rigour and equivalence of comparative quantitative research, especially in attitude studies
• to develop and gain acceptance for social indicators, including attitudinal measures, that are able to stand alongside the more familiar economic indicators of societal progress.
If we were ever remotely to fulfil these aims, we required not only a well-formulated model, as provided by the Blueprint document, but also a detailed modus operandi that was demonstrably capable of delivering that model on the ground. This issue loomed large in the initial application to the European Commission for Round 1 funding, submitted in June 2000, which – we reasoned – was not aimed solely at the Commission but also at the many national funding agencies that might soon be called on to fund their own fieldwork and national coordination for the first round. Our plans thus had to stand up to the detailed scrutiny not only of the European Commission’s officers and referees, but also of more than 20 separate national funders. The plans also had to persuade the wider academic community from among whom National Coordinators would subsequently be appointed that the project was not only doable but worth doing. And they had to be acceptable to the various national field agencies that would ultimately be asked to implement the plans on the ground. In summary, our initial task was to persuade an unusually large number of knowledgeable and habitually sceptical observers that the ESS was capable of becoming an especially authoritative and influential study, both substantively and methodologically. It is clearly a long journey from the starting point of even a splendid design to its simultaneous implementation in over 20 countries. In this chapter we briefly summarise not only the range of design characteristics and innovations that we believe have been critical to the success of the ESS, but also the set of structural arrangements that have contributed most to their implementation. Subsequent chapters deal in more detail with many of these topics. But we should re-emphasise that the detailed design specification for the ESS is not followed in some cases with quite the same precision as it is in others.
As noted, some of the inherent difficulties of cross-national studies have proved extremely difficult to solve, and there have been errors of omission and commission en route. The deviations that have occurred are discussed later in this chapter. Thankfully, however, the compliance rate on most of the ESS’s demanding list of requirements is impressive. And for this, a great deal of credit goes to the National Coordinators. So, as noted, we believe we have achieved more than expected in terms of sampling equivalence between countries. But in aspects of fieldwork, we still have some way to go. Granted, face-to-face interviewing is universally applied in the ESS, as are many other key fieldwork requirements, but the reality is that fieldwork organisations tend to have their own preferred procedures, which even the most well-monitored survey cannot easily influence. For instance, although we specify a maximum number of
respondents per interviewer in order to reduce the impact of interviewer variability on the results, this requirement is often unilaterally abandoned (perhaps appropriately) when it is seen to conflict with the achievement of high response rates. The same reasoning sometimes applies to the stretching of fieldwork deadlines, resulting in a wider than hoped for range of national fieldwork periods.

Continuity

Any multinational time series such as the ESS depends above all not just on a consistent methodology but also on continuity of participation by the nations involved and, of course, on uninterrupted funding. Although in these respects the ESS has been particularly fortunate so far, it has not yet achieved any real security. Instead, it still has to rely on the round by round success of applications for funding both of its coordination and of each country’s participation in the enterprise. So every biennial round of the ESS involves over 25 independent funding decisions – each of which, if negative, could inflict damage on the project as a whole. We hope this may change in EC Framework 7, but we will have to wait and see. Meanwhile, the continuity of national participation and funding throughout the first three rounds of the ESS has admittedly been remarkably smooth. Table 1.1 shows the pattern of national participation over the first three biennial rounds of the ESS by the 32 countries that have funded and fielded at least one round.

Table 1.1
The 32 ESS participating countries to date

Austria, Belgium, Bulgaria, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Luxembourg, Netherlands, Norway, Poland, Portugal, Romania, Russia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey, UK, Ukraine

[Round-by-round participation marks (R1–R3) not reproduced]

Notes: Number of countries in Round 1: 22; number of countries in Round 2: 26; number of countries in Round 3: 25–28
In sum, 18 European countries may be described as perennial ESS participants, having taken part in all three rounds to date.1 Four further countries who joined at Round 2 are also participating in Round 3.2 Five further Round 3 joiners will, we hope, sustain their participation into future rounds. And the five remaining participants that failed to obtain funding for Round 2 and/or Round 3 are all determined to remain in the fold and to rectify their funding gap in Round 4 and beyond. So although results suggest that we ought perhaps to be confident about the longer-term stability of the ESS, the persistence of the present funding regime – with its multiplicity of independent decision trees – is simply not conducive to a strong sense of security. On a more positive note, some countries have recently managed to secure a longer-term commitment to ESS participation (usually up to two rounds ahead), on condition that the EC’s core-funding of the project – itself subject to a round by round competition – continues to flow. We are delighted to report that an early decision by the Commission to core-fund ESS Round 4 (in 2008/2009) has recently been secured. The continuity of funding and national participation that the project has enjoyed so far has undoubtedly been a key factor in attracting analysts to its dataset. Not only does the relatively stable range of countries within each round enable cross-national comparisons to be validated, but the repeated participation of over 20 countries enables all-important analyses to be made of changes within and between nations.

Governance

The origins of the unusual governance arrangements of the ESS may be found in its initial Blueprint, though they have been adjusted as necessary to fit the circumstances of a larger and potentially more cumbersome enterprise than had been envisaged. Figure 1.1 summarises the overall organisational structure of the ESS.
At the heart of the governance arrangements are the six institutions listed at the centre and centre-left of Figure 1.1. They constitute the ESS Central Coordinating Team (CCT) (see Note 4 at the end of this chapter), which collectively holds the various grants for the project and takes overall responsibility for the programme of work (see ‘Division of tasks’ in this chapter). But the successful execution of the project at a national level relies equally on the country teams on the right of the Figure (National Coordinators and

1 Italy is included in this figure though its funding for Round 3 is still uncertain.
2 Iceland and Turkey are included in this figure though their funding for Round 3 is still uncertain.
Figure 1.1 ESS organisational structure
survey institutes) which ensure that the project is faithfully adapted, translated and carried out to the same exacting standards in all nations. The four bodies at or near the top of Figure 1.1 collectively ensure that the project adheres to or exceeds its ambitious ideals. Chaired by Max Kaase, the Scientific Advisory Board (see Note 5 at the end of this chapter) meets twice a year and has been remarkably stable in its membership. Board members are eminent social scientists from all ESS participating countries, each nominated by their main academic Funding Council. Individually and collectively, they help to steer the ESS in virtuous directions, influencing its key decisions. Moreover, the Board also plays the sole executive role in the selection of specialist Question Module Design Teams, the bodies which help to design one half of the questionnaire at each round. The Funders’ Forum (see Note 6 at the end of this chapter) consists of senior staff members from each of the national funding bodies (plus the EC and the ESF). It meets less frequently – usually about once a year – and its key role is to monitor the progress of the project and, in particular, its role as a large, long-term multinational investment. It attempts to foresee and prevent unintended funding discontinuities.
The smaller Methods Group (see Note 7 at the end of this chapter) is chaired by Denise Lievesley and consists of four other eminent survey methodologists. It also meets about once a year to tackle the knotty technical and statistical issues that a project of this size and complexity inevitably throws up. They respond admirably to the numerous technical conundrums that are put to them, guiding the CCT towards appropriate solutions. And they advise on the ESS’s methodological programme, injecting new ideas and helping to produce elegant solutions. As noted, new Question Module Design Teams (see Note 8 at the end of this chapter) are selected at each round to help formulate the rotating elements of the questionnaire, which form nearly one half of its content. This procedure is designed to ensure that the ESS’s content is determined not only by the need for continuity but also by a dynamic ‘bottom-up’ process. An advertisement is placed in the ‘Official Journal’ well before each round starts and it is publicised through National Coordinators within their own countries. It invites multi-national teams of scholars to apply for the chance to help design a (now) 50-item module of questions on a subject of their choosing for the following round of the survey. In general, two such teams are selected by the project’s Scientific Advisory Board, having considered the suitability of the subject and the experience of the prospective team. The successful teams then work closely with the CCT to develop suitable rotating modules for pilot and subsequent fielding in the next round of fieldwork (refer to the Questionnaire section of the ESS website). Seven rotating modules have been fielded to date in one or other of Rounds 1 to 3, and their data are widely quarried by analysts (see the description of Workpackages 8 and 9 later in this chapter). There were concerns at the start about whether this procedure for designing rotating modules would work. 
But thanks largely to the quality of the teams selected at each round, and to the astute comments and suggestions we receive from National Coordinators, it has worked very well, extending both the depth and breadth of the project as a whole. As far as the National Coordinators and survey institutes (see Note 9 at the end of this chapter) are concerned, we are fortunate in having a skilled and committed body of people and organisations who are in all cases appointed and financed by their national academic funding agencies. They collectively represent the leading edge of social survey research practice in Europe. Although their official role is country-specific, they also lend considerable expertise to the project as a whole through a series of National Coordinator meetings and regular email and telephone contact. Their task above all is to ensure that what happens on the ground in their country matches as closely as possible the requirements and expectations of the ESS specification – whether in respect of sampling, translation, fieldwork or coding. As the essential link between the CCT at the
centre and what happens in each nation, they take legitimate credit for bolstering the consistent standards to which the ESS tries to adhere.

Division of tasks

In common with most Commission-funded projects, the ESS work programme is divided in advance into distinct but overlapping ‘workpackages’, each the responsibility of one or more of the CCT institutions. The 11 workpackages are:
Workpackages 1 and 2: Overall project design and coordination

The City University team in London3 is contractually responsible for the design and subsequent delivery of the whole programme of work according to budget and timetable, for initiating and convening team meetings, and for liaison with funders, advisers, National Coordinators and the wider social science community. Although CCT meetings are regular events, most of the coordination and communication naturally takes place outside these meetings. So City acts as the hub of the project and is at the centre of communication and discourse with CCT members, national teams, the project’s many influential advisers, the growing number of scholars in the wider social science community who have an interest in ESS methods and outputs, the project’s core funders (the EC and the ESF), and the many national funding bodies that collectively supply the bulk of the overall budget for the project. The City team is also responsible for framing the ‘Specification for Participating Countries’, updated at every round, which lays out in meticulous detail the procedures, standards and outputs required for each aspect of the survey’s implementation (see Project Specification section of the ESS website).

3 Roger Jowell, PI and ESS Coordinator; Rory Fitzgerald; Caroline Roberts; Gillian Eva and Mary Keane. Recent additions to the City team are Daniella Hawkins, Eric Harrison, Sally Widdop and Lynda Sones. In addition, Rounds 1 and 2 would never have got off the ground so smoothly and efficiently in the absence of three former members of staff – Caroline Bryson, Ruth O’Shea and Natalie Aye Maung.

But City also has the lead role in questionnaire design at each round of the ESS. While the core questionnaire – which accounts for about one half of the total interview duration – remains as stable as possible from round to round, it is nonetheless continually under review by both the CCT and the Scientific Advisory Board. Limited changes have been introduced at each round, some
to remove or amend demonstrably ‘bad’ items, others to introduce new items on emerging issues. But the very purpose of the core – to measure long-term value changes – requires that we should avoid being fidgety with its content. The main round by round task of the City team in respect of questionnaire design is to work closely with the respective Question Module Design Teams (QDTs) on the shape and content of the rotating modules for each round – a protracted process involving face-to-face meetings, several drafts of questions, and two pilot studies (in separate countries) to iron out problems. Only after a detailed analysis of the pilot studies, followed by extensive consultations with the QDTs and National Coordinators, is the module eventually ‘put to bed’ and sent out for translation into multiple languages. The whole questionnaire design process, including its various interim ‘conclusions’, is documented as it takes place and made available on the web immediately so that National Coordinators and others can join the discussions ‘in real time’ and have their say.
Workpackage 3: Sampling

The Sampling Panel (see Note 10 at the end of this chapter) is convened by Sabine Häder at ZUMA4 and has three other specialist members.

4 The ZUMA team as a whole consists of Peter Mohler, Janet Harkness, Sabine Häder, Achim Koch and Sigfried Gabler. Recent additions to the team are Annelies Blom, Matthias Ganninger and Dorothée Behr.

The ESS has an unusual and innovative sampling specification which requires, among other things, each country to aim for the same ‘effective sample size’, not necessarily the same nominal sample size (see chapter 2). So it is not just the anticipated response rate that a National Coordinator and the Sampling Panel have to take into account in determining the starting number of individuals (or addresses) to select, but also the ‘design effects’ that their chosen design will generate – a function of its extent of clustering. The greater the degree of clustering in the sample design, the larger the starting sample size must be. It is the Sampling Panel’s role to ensure that these ‘rules’ are closely adhered to. To help achieve this, the Panel allocates each of its individual members to work with a particular set of countries, ensuring that each country has a single named adviser to consult with as necessary. Where the situation requires it, this adviser will travel to the country concerned to investigate possibilities and help find solutions. In any event, each national sample design has in the end to be ‘signed off’ by the Sampling Panel before it is adopted and implemented. We are confident that by these means the ESS achieves equivalent random samples of an unusually high standard. Each national sample is designed to be a probability sample of all residents in that country (not just of its citizens) who are
aged 15 and over (with no upper age limit). For full details of definitions and precise sampling procedures, see the Methodology section of the ESS website.
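The arithmetic behind the effective sample size rule can be sketched briefly. The code below is our illustration, using the standard Kish approximation to the design effect for clustered samples; it is not the ESS's own worksheet, and the target size, cluster size, intra-cluster correlation and response rate are all hypothetical.

```python
# Illustrative sketch of the sampling arithmetic: countries aim for the same
# *effective* sample size, so more clustered designs must issue more cases.
# Uses the standard Kish approximation deff = 1 + (b - 1) * rho.
# All parameter values below are hypothetical, not ESS figures.

def required_issued_sample(n_effective, cluster_size, rho, response_rate):
    deff = 1 + (cluster_size - 1) * rho       # design effect from clustering
    n_net = n_effective * deff                # completed interviews needed
    return int(round(n_net / response_rate))  # gross sample to issue

# Unclustered register sample, 70% expected response:
print(required_issued_sample(1500, 1, 0.02, 0.70))   # → 2143
# Clustered address sample (10 cases per cluster), same target:
print(required_issued_sample(1500, 10, 0.02, 0.70))  # → 2529
```

The second call shows why two countries pursuing the same effective sample size can legitimately issue quite different numbers of addresses.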
Workpackage 4: Translation

The Translation Taskforce (see Note 11 at the end of this chapter), chaired by Janet Harkness at ZUMA, exists for similar reasons but is not in a position to sign off every translation in every country. As noted, the questionnaire is drafted in English and is then translated not only into each country’s majority language(s) but into every language spoken in that country as a first language by more than five per cent of the population. So several countries – not just the usual suspects such as Switzerland and Belgium – have to translate the source questionnaire into more than one language. The role of the Translation Taskforce is to design, implement and continually refine the protocols and procedures for ensuring equivalent translations, as well as to advise and guide National Coordinators on problems as they arise (see chapter 4). To facilitate the work of the translators, reviewers and adjudicators who are assembled in each country for the purpose of turning all source questions into their own language(s) without changing their meaning, all identifiably ambiguous words and phrases in the source questionnaire are ‘annotated’ in advance with a brief description of their underlying meaning. For instance, one of the batteries of questions in the ESS is designed to measure political tolerance and asks respondents how much they agree or disagree with a series of statements.
One of those statements is: “Political parties that wish to overthrow democracy should be banned.” Because of the potential ambiguity of the word ‘democracy’ in this context, the pre-translation source questionnaire contains the following annotation to help translators find an equivalent form in their own language: “ ‘Democracy’ here refers to an entire system or any substantial part of a democratic system, such as the government, the broadcasting service, the courts, etc.” Similarly, the question “How often do you meet socially with friends, relatives or work colleagues?” is accompanied in the pre-translation source questionnaire by the annotation “ ‘Meet socially’ implies meet by choice rather than for reasons of either work or pure duty.” These annotations and many others do not of course appear on the final translated questionnaire, since they are certainly not for the interviewers’ use. Rather they are available solely for translators to help them find the most suitable equivalent phrase in their language. The protocols for translation lay down procedures about what should be done and not done in reaching conclusions on equivalent translations, and how to resolve difficulties. They also give detailed guidance on when and how to use translations that have already been made into the same language
by another ESS country (there are many more language overlaps among ESS countries than we had casually anticipated). For full details of the content of the protocols, refer to the Translation Guidelines in the ESS documents part of the ESS website.
Workpackage 5: Commissioning fieldwork Although the selection of fieldwork agencies in each country is the responsibility of the national funding body together with the National Coordinator, the process is coordinated and documented by Ineke Stoop5 at the Social and Cultural Planning Office in The Netherlands. We decided early on not to entrust the fieldwork to a single multi-national supplier, but rather to leave each country to find its preferred supplier and try to ensure that it adhered to the Specification. (In some countries we anticipated that the preferred supplier would be the National Statistical Institute, and so it proved.) But a hazard of remote management of the sort that characterises multi-national surveys is that the longer the chain of command, the likelier it is to break down. In this context, we had to ensure that the survey houses, not just the National Coordinators, knew in advance of costing of the project precisely what they would be signing up to. So the Specification for Participating Countries is provided to every potential fieldwork supplier as part of their invitation to bid for the contract. As noted, it contains details such as the required sampling procedure and size, the target response rate, the maximum number of sampled people/addresses which may be assigned to any one interviewer, the number of calls required before a sampled address may be abandoned, the maximum acceptable proportion of non-contacts, and much besides. These explicit specifications give field agencies advance knowledge of the size and nature of the task they are committing themselves to, helping them to avoid under-costing and, as a result, an inevitable lowering of standards. As always, the quality of a survey project of this nature depends critically on the quality of its fieldwork. So a great deal rests on the survey houses, which are at one step removed from the central project management. 
To mitigate this potential problem, the CCT has to work closely with National Coordinators as early as possible in the process, helping to ensure that their communication with field agencies is as clear and comprehensive as possible. Once the contract has been completed, an English summary of it is passed on to the CCT. It must be stressed that none of these measures indicates the least lack of confidence in either the National Coordinators or the fieldwork agencies, both of whom do their jobs conscientiously and with consummate skill. They are introduced because – on the basis of the experience of other multi-national surveys – things do go wrong, and trying to correct them post hoc is usually either less effective or plain impossible.

5. Recent additions to the team are Thomas van Putten, Paul Dekker, Peter Tammes and Jeroen Boelhouwer.
Workpackage 6: Contract adherence

The task of monitoring and helping to ensure contract adherence in all aspects of national performance is the responsibility of Achim Koch at ZUMA, recently with the help of Annelies Blom. There is, of course, a fine balance to be struck between policing on the one hand and persuasion on the other. Although the Specification for Participating Countries contains details of the respective responsibilities of National Coordinators, field agencies and the CCT itself, the project ultimately stands or falls according to how closely the Specification is adhered to. To assess this, and where necessary to remedy it, close monitoring is essential, together with readily available support when it is required.

A series of questionnaires filled in by National Coordinators – on the progress of sampling, translation and fieldwork subcontracting – provides the raw material for progress monitoring. Signs of potential non-compliance with the Specification, nearly always inadvertent, may be picked up in the process and rectified before it is too late. Similarly, unanticipated difficulties on the ground, such as late fieldwork or lower than predicted response rates, may be discovered and discussed. Many of these problems cannot, of course, be rectified, but some can, and others provide the sort of accumulated intelligence that allows a time series to improve round by round. For instance, from Round 2 onwards, National Coordinators have been providing a projection of the number of interviews expected to be completed per week as a benchmark against which to chart progress within and between countries.

Naturally, some national variation is inevitable and even appropriate. Certain deviations from the Specification are necessary to comply with domestic laws, conventions or circumstances.
For instance, the standard practice in the ESS of recording brief descriptions of non-respondents' neighbourhoods is apparently contrary to data protection laws in some countries, so it cannot be pursued there. By the same token, fieldwork has been delayed in certain countries because it would have clashed with the build-up to a national election, or simply because of hold-ups in funding.

Other deviations come to light only once the data from a country are scrutinised as part of the archiving procedures. In these cases the deviation is flagged both in the end of grant report (Central Coordinating Team, 2004) and in the dataset itself, so that data users are aware of it. This sort of openness about the project's shortcomings (rather than ignoring or suppressing them) may
make the ESS seem like the most error-prone survey of all time; but so be it. In extreme cases, we not only flag the issue but also remove the offending variable from the combined data file (noting that we have done so), thus preventing inattentive data users from mistakenly treating it as equivalent. An example was a source question containing the word “wealthy”, which was inadvertently translated in one country as “healthy” – with predictable consequences for the findings in that country. In a similar vein, the CCT discovered that one country had mistakenly excluded non-citizens from its sampling frame. But this error was rectified in time by interviewing a separate, appropriately sized random sample of resident non-citizens and merging the data into the main dataset.

As noted, we are able to report with some relief that the number of serious deviations in any round of the ESS so far has been small. And for this a great deal of the credit goes to the National Coordinators. In any case, all countries do indeed employ strict probability sampling methods, all countries do conduct translations for minority languages spoken as a first language by at least five per cent of the population, and all countries do conduct face-to-face interviews (apart from a few agreed experimental treatments by telephone).

But overall, compliance is lower when it comes to fieldwork procedures because they are inherently more difficult to influence remotely. Although face-to-face interviewing is universal in the ESS, as are most other key fieldwork practices, many fieldwork organisations have their own preferred procedures which even the most well-monitored survey cannot easily influence. Notably, the timing of fieldwork periods in some countries has run significantly over the deadline, usually reflecting the fragility of their national funding arrangements.
Similarly, several countries fail to get even close to the ESS's specified target response rate of 70 per cent (see chapter 6). Nonetheless, ESS response rates are generally higher or much higher than those achieved in similar social surveys in the same countries, suggesting perhaps that even an unattainable target can in some circumstances be an effective motivator.

The conclusion we draw from the balance we have struck between persuasion on the one hand and policing on the other is that both are essential to some extent in a dispersed multi-national survey such as the ESS. We have found, for instance, that deviations from standard practice have been most in evidence when central attention to those practices has been least in evidence (such as our failure to monitor the laid-down maximum size of interviewer assignments in Round 1). Our experience also suggests that both top-down and bottom-up improvements can be introduced, sometimes even during a particular round, but certainly between rounds. So at the start of each round the CCT draws the attention of National Coordinators to the deviations that occurred in the previous round, alerting
them to possible pitfalls and how to avoid them. But individual countries are also encouraged and helped to introduce their own measures to improve standards. Switzerland, for example, has made heroic (and successful) efforts to raise its very poor Round 1 response rate significantly in Round 2, while The Netherlands has made important strides in testing how to motivate both respondents and interviewers, using a series of alternative incentives.
Workpackage 7: Piloting and data quality

This work is led by Jaak Billiet at the University of Leuven. In Rounds 1 and 2, he was ably supported by the involvement of two outstanding PhD researchers within the university.6 As noted, having been drafted and re-drafted, the final draft ESS questionnaire then goes through a simultaneous two-country pilot, one of the two countries being English-speaking. The Round 1 pilot took place in the UK and The Netherlands, the Round 2 one in the UK and Poland, and the Round 3 one in Ireland and Poland. The sample size for the pilots is around 400 per country, sufficiently large to test scales and questions. Although their primary purpose is to test the rotating modules at each round, several other questions are included in the pilots, either to investigate new issues or as independent variables in the data analysis. Some of the questions also go through a 'Multitrait Multimethod' analysis to test alternative versions of the same basic question. The pilot data are thoroughly analysed by both the CCT and the QDTs, after which the questions are re-appraised in the light of the results.

The prime motive behind the ESS is to provide for scholars in Europe and beyond a regular and rigorous set of comparative datasets as a basis for measuring and analysing social change. So the achievement of data quality has always been a primary concern. There are, of course, numerous components of data quality – among them a representative sample, well-honed questions, skilled interviewing, harmonised coding, and many others. But the University of Leuven's main focus has been on how to measure and mitigate the potentially damaging effects of different response rates in different countries. This work takes place against the background of falling response rates over the years – a trend that in many cases has simply been accepted as a sign of the times and as yet another instance of the democratic deficit at work.
But inertia on the part of social scientists in relation to issues as serious as this would slowly undermine the reliability of quantitative social measurement.
6. Michel Phillippens and Stefaan Pleysier. Other members of the ESS team at the University of Leuven include Silke Devacht, Geert Loosveldt and Martine Parton.
It was with this in mind that the ESS has from the start set a target response rate for all countries of 70 per cent. We realised, of course, that this target was unlikely to be universally achieved, but we introduced it in the hope that it would nonetheless help to raise the bar significantly. And, on all the available evidence to date, this is precisely what has happened.

The most important requirement is, of course, that contracts with survey houses incorporate the means of achieving high response rates, and that they are appropriately budgeted for. These include a minimum duration of fieldwork that allows enough time to find elusive respondents, and call patterns which ensure that unsuccessful visits to addresses are repeated at different times of day and on different days of the week (see chapter 6).

But simply setting such standards does not guarantee they will be faithfully followed in all cases. Also necessary is an objective means of monitoring, documenting and analysing what happens on the ground, providing regular checks of the process and – in the longer term – helping to improve it round by round (Lynn, 2003). So for each call at each address throughout the course of fieldwork, interviewers are required to complete a detailed 'Contact Form' containing valuable information about the interaction between interviewers and the addresses they visit. It is this unique set of records that informs the Leuven team's analyses of response rates, refusals and non-contacts, and which guides the CCT (and the wider survey research community) on strategies for arresting and perhaps reversing the downward trend in response rates.
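A call-pattern rule of this kind lends itself to an automated check over call records. The sketch below is purely illustrative: the record layout, the minimum of four calls and the 17:00 evening cut-off are assumptions made for the example, not the values laid down in the ESS Specification.

```python
from datetime import datetime

MIN_CALLS = 4  # assumed minimum number of calls before an address may be abandoned

def calls_satisfy_spec(call_times: list[datetime]) -> bool:
    """Check that the calls made to one address were spread over different
    days of the week and over both daytime and evening, in the spirit of
    the call patterns described in the text (assumed thresholds)."""
    if len(call_times) < MIN_CALLS:
        return False
    weekdays = {t.weekday() for t in call_times}  # 0=Mon .. 6=Sun
    dayparts = {"evening" if t.hour >= 17 else "daytime" for t in call_times}
    # at least two different days of the week, and both daytime and evening calls
    return len(weekdays) >= 2 and len(dayparts) == 2
```

A fieldwork monitor could run such a check over the accumulated call records before accepting an address as a final non-contact.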
Workpackages 8 and 9: Question reliability and validity

Work on establishing the reliability and validity of individual ESS questions has from the start been the responsibility of Willem Saris and Irmtraud Gallhofer, formerly at the University of Amsterdam and now at ESADE Business School at the Universitat Ramon Llull, Barcelona. Discussions about the possible content of the perennial core ESS questions started well in advance of the publication of the Blueprint document. In the end, a consensus emerged that three broad themes should be included in the core:

• People's value orientations (their world views and socio-political standpoints).
• People's cultural/national orientations (sense of attachment to various groups and their feelings towards outgroups).
• The underlying social structure of society (socio-economic and socio-demographic characteristics).

Within each of these broad areas we identified a larger number of sub-areas and then commissioned academic specialists in each field to prepare a paper, based on a literature review, recommending which questions in each sub-area they would regard as essential components of the proposed ESS.
Not unexpectedly, several of the desired topics turned out to lack appropriate or well-honed existing questions. But the papers nonetheless provided an excellent background for theory-based questionnaire construction. A draft core questionnaire for Round 1 was eventually produced containing all the proposed essential elements, some of which were represented by 'classic' questions, others by freshly minted ones. This draft, together with the drafts of the two rotating modules for Round 1, then went through a number of checks and tests before being adopted. They included 'predictions' of their measurement qualities based on each question's basic properties and making use of the Survey Quality Program, or SQP (Saris et al., 2004). A two-nation pilot and its subsequent analysis followed, in which alternative items were tested for reliability and validity using the Multitrait Multimethod technique (Scherpenzeel and Saris, 1997) (see chapter 3). All subsequent rotating modules and new core questions have gone through a similar process before being finalised.

Although these measures do not help us to make choices between topics, they certainly help to guide question and scale construction. The body of work will in time also help to identify and, we hope, rectify problems with translation from the English source questionnaire into other languages, as well as enabling differential measurement error between countries to be neutralised.

The core questionnaire contains the following broad list of topics:

- Trust in institutions
- Political engagement
- Socio-political values
- Social capital, social trust
- Moral and social values
- Social exclusion
- National, religious, ethnic identities
- Well-being and security
- Demographic composition
- Education and occupation
- Financial circumstances
- Household circumstances

The rotating modules fielded to date are:

Round 1
- Immigration
- Citizen engagement and democracy

Round 2
- Family, work and well-being
- Economic morality
- Health and care-seeking

Round 3
- Indicators of quality of life
- Perceptions of the life course
Workpackage 10: Event monitoring

A time series that monitors changes in attitudes can certainly not afford to assume that attitudes exist in a vacuum. They change over time in response to events and to a range of other factors. A simplifying assumption sometimes made is that, within a particular round of a time series, the impact of events is likely to be rather small. But while this may be true enough in a national survey, it is much less likely to be true in a multi-national survey involving a large number of disparate countries. This was the reason that we introduced event monitoring (see chapter 5) into each round of the ESS. Its purpose in essence is to describe and record the primary short-term events that might create turbulence in the trend-lines of different countries. Unless these events are charted, future analysts might find it difficult to explain why certain apparently inexplicable blips had occurred. From the start, this work has been initiated and coordinated by Ineke Stoop at the Social and Cultural Planning Office in the Netherlands.

The 9/11 attack on New York or the Chernobyl nuclear power disaster are examples of events that deeply affected attitudes and perceptions worldwide. But lesser events, such as a rash of serious crimes in a particular country, or even moments of political turbulence (such as a national election), can also have an impact on public opinion at a national level, so they too ought to be monitored and recorded. Because no budget had been allocated to this process, a somewhat rudimentary system of event recording was devised for Round 1, which has since been upgraded and is soon to be upgraded further. So far, however, it has been up to National Coordinators to compile regular reports of any events just prior to or during fieldwork that receive 'prominent' attention in the national press (such as repeat front-page coverage or sustained coverage in later pages).
Relevant events are those that might conceivably have an impact on responses to ESS questions. These events are allocated to fixed categories with keywords, a short description, plus start and end dates. Although now under consideration for an overhaul, the basic system of event recording devised for the early rounds of the project has served its purpose well and has been greatly welcomed by many data users.
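A minimal version of such an event record – with field names that are illustrative assumptions, not the ESS's actual coding scheme – might look like this, together with the obvious query of which recorded events overlapped a country's fieldwork period:

```python
from dataclasses import dataclass
from datetime import date

# Illustrative event record: fixed category, keywords, short description,
# and start/end dates, as described in the text. Field names are assumed.
@dataclass
class Event:
    country: str          # country code, e.g. "NL"
    category: str         # fixed category, e.g. "election", "crime"
    keywords: list[str]   # short keywords for retrieval
    description: str      # short free-text description
    start: date           # first day of prominent coverage
    end: date             # last day of prominent coverage

def events_during_fieldwork(events, fieldwork_start, fieldwork_end):
    """Return the events whose coverage period overlaps the fieldwork
    period, i.e. those that might plausibly have affected responses."""
    return [e for e in events
            if e.start <= fieldwork_end and e.end >= fieldwork_start]
```

An analyst puzzled by a blip in one country's trend-line could then retrieve, by category and keyword, whatever was prominent in that country's press while its interviews were being conducted.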
Workpackage 11: Data access and aids to analysis

The large corpus of activities that falls under this heading is the responsibility of the team at Norwegian Social Science Data Services (NSD) in Bergen.7

7. Bjørn Henrichsen, Knut Kalgraff Skjåk, Kirstine Kolsrud, Hilde Orten, Trond Almendingen, Atle Jastad, Unni Sæther, Ole Voldsæter, Atle Alvheim, Astrid Nilsen, Lars Tore Rydland, Kjetil Thuen and Eirik Andersen.
In strict accordance with the original motivation behind the ESS, the combined dataset containing the data from all participating countries is made available free of charge to all as soon as practically possible (see chapter 7). As noted, neither the CCT members, nor the National Coordinators, nor the QDTs, nor for that matter anyone else is granted prior access to the dataset, except for checking and quality control purposes. So none of the key players in the design, formulation and execution of any round of the ESS has had the all too familiar 'lead time' that enables them to quarry the data and prepare publications earlier than others. Instead, the overriding principle of the ESS model is that it should speedily provide a high-quality public-use dataset that is freely and easily accessible on-line to all comers. This policy may help to explain why, in less than three years since the initial (Round 1) dataset was released, the ESS has already acquired more than 10,000 registered users.

All that potential users have to do is to enter the data website (http://ess.nsd.uib.no) and provide their name and email address. They are then granted immediate access to the fully documented dataset, along with a considerable amount of metadata. The NESSTAR distribution system enables users to browse on-line and to produce tabulations more or less at the touch of a button. But it also allows them instantly to download all or parts of the dataset in a number of formats for subsequent more complex analyses. More than 50 per cent of the 10,000 registered users have used this download facility.

Each fully documented ESS dataset so far has been made publicly available on the web within nine months of each Round's fieldwork deadline. But the initial releases do not, of course, include the small minority of countries whose fieldwork has run exceptionally late in that round. Their data are in each case merged into the combined dataset later.
As Kolsrud and Skjåk (2004) have observed, it is the ESS’s central funding and organisational structure that has enabled it to produce a set of equivalent measures which should be the hallmark of quality in a multi-national dataset. This is largely a product of input harmonisation, including adherence to internationally accredited standards, which is achieved by giving National Coordinators on-line access to the documentation, standards, definitions and other tools required to ensure such adherence. Flaws are thereby minimised. (See the ESS Data Protocol in the Archive and Data section of the ESS website for a comprehensive guide to the required procedures for depositing data and the accompanying documentation.) The data website also contains information about the socio-cultural context in each country. Based on data assembled at a national level, the site contains population statistics on age and gender, education and degree of urbanisation and – in response to requests from Questionnaire Design Teams – specific background statistics that help set the context for their particular modules, such as
the racial composition of the population or levels of immigration. NSD also routinely adds data on national elections, GDP and life expectancy, and provides links to the SCP-compiled Event Data referred to earlier. A guide to sources of pan-European context data, compiled by SCP Netherlands, is also provided on the site.

As well as the organisation, archiving, and provision of access to successive rounds of ESS datasets, NSD is also responsible for 'EduNet' (http://essedunet.nsd.uib.no), an on-line training facility that makes use of the ESS datasets to introduce and guide new researchers through a range of data analysis methods. So far, the EduNet facility has utilised the Human Values Scale, a regular part of the ESS core, and the Round 1 rotating module on citizenship, on which chapters 8, 9 and 10 all draw.

Conclusion

All surveys are judged by the analytical power they provide. On the evidence of the very large body of serious data users that the ESS dataset has already attracted, we hope that the project comfortably passes this test and will continue to do so. But we also hope that the ESS will be judged by the methodological contributions it is making to the design and conduct of large-scale comparative surveys more generally (see chapter 11).

The project is after all a product of a bottom-up process from within the European social science community. It was the ESF – representing almost all national academic science funders in Europe – that provided the project's initiators with the seed money to investigate the possibilities, and then further seed money to turn a seemingly plausible idea into a working reality. The ESF has also steadfastly continued to finance and service the project's Scientific Advisory Board – which, as noted, is by no means a token institution. And the national academic science funding agencies provide the lion's share of the total cost of the ESS through their continued financing of domestic coordination and fieldwork.
The project is particularly fortunate in being able to benefit not only from the financial support of these bodies, but – as importantly perhaps – from the authority they give to the ESS and its methods. But the ESS has also been very fortunate to enjoy long-term core support and finance from the EC. In the absence of this central funding, the project would have been still-born. Not only has the Commission already agreed to fund four biennial rounds of the survey to date, stretching its central funding from 2001 to at least 2009, but it has also more recently agreed to provide large-scale ‘Infrastructure’ support for the project until 2010. This new form of support for the social sciences is quite different in character from the round-by-round support that the project has enjoyed so far. It is instead
designed to encourage more outreach, innovation, training and methodological work, and to extend and enhance the ESS's existing access provision. Its budget is supporting a modest expansion of the staff in the CCT institutions, the addition of a new partner institution of the CCT – the University of Ljubljana, Slovenia – and a range of new organisational and methodological enhancements. It is a welcome recognition – just as was the award to the ESS of the Descartes Prize in 2005 – that 'big' social science has similar characteristics and similar needs to those of 'big' science.

The real heroes of the ESS, however, are the (so far) three sets of around 35,000 respondents at each round, spread across a continent, who have voluntarily given their time and attention to our endless questions. We owe them a huge debt of gratitude.

Notes

1. Members of the Expert Group were: Max Kaase, Chair; Bruno Cautrès; Fredrik Engelstad; Roger Jowell; Leif Nordberg; Antonio Schizzerotto; Henk Stronkhorst; John Smith, Secretary.
2. Members of the Steering Committee were: Max Kaase, Chair; Rune Åberg; Jaak Billiet; Antonio Brandao Moniz; Bruno Cautrès; Nikiforos Diamandouros; Henryk Domanski; Yilmaz Esmer; Peter Farago; Roger Jowell; Stein Kuhnle; Michael Laver; Guido Martinotti; José Ramón Montero; Karl Müller; Leif Nordberg; Niels Ploug; Shalom Schwartz; Ineke Stoop; Françoise Thys-Clement; Niko Tos; Michael Warren; John Smith, Secretary.
3. Members of the Methodology Committee were: Roger Jowell, Chair; Jaak Billiet; Max Kaase; Peter Lynn; Nonna Mayer; Ekkehard Mochmann; José Ramón Montero; Willem Saris; Antonio Schizzerotto; Jan van Deth; Joachim Vogel.
4.
The institutional grant-holders and senior members of the Central Coordinating Team (CCT) are: Roger Jowell (City University, London, UK), Principal Investigator; Jaak Billiet (University of Leuven, Belgium); Bjørn Henrichsen (NSD, Bergen, Norway); Peter Mohler (ZUMA, Mannheim, Germany); Ineke Stoop (SCP, The Hague, Netherlands); and Willem Saris (University of Amsterdam, Netherlands, now at ESADE Business School, Universitat Ramon Llull, Barcelona). Institutional membership has recently expanded to include Brina Malnar at the University of Ljubljana, Slovenia, assisted by Vasja Vehovar, Tina Zupan and Rebeka Falle.
5. Members of the SAB are: Max Kaase, Chair; Austria: Anton Amann; Belgium: Piet Bracke and Pierre Desmarez; Bulgaria: Atanas Atanassov; Cyprus: Kostas Gouliamos; Czech Republic: nomination pending; Denmark: Olli Kangas; Estonia: Dagmar Kutsar; Finland: Matti Heikkilä; France: Bruno Cautrès; Germany: Ursula Hoffmann-Lange; Greece: John Yfantopoulos; Hungary: Gergely Böhm; Iceland: Stefán Ólafsson; Ireland: Seán Ó Riain; Israel: Shalom Schwartz; Italy: Guido Martinotti; Latvia: Aivars Tabuns; Luxembourg: Andrée
Helminger; Netherlands: Jacques Thomassen; Norway: Ann-Helén Bay; Poland: Henryk Domanski; Portugal: Manuel Villaverde Cabral and João Ferreira de Almeida; Romania: nomination pending; Russia: Vladimir Magun; Slovak Republic: L’ubomir Falt’an; Slovenia: Niko Tos; Spain: José Ramón Montero; Sweden: Robert Erikson; Switzerland: Peter Farago; Turkey: Yilmaz Esmer; Ukraine: Eugene Golovakha; United Kingdom: Jacqueline Scott; European Commission: Virginia Vitorino and Andrea Schmölzer; European Science Foundation: Henk Stronkhorst and Gün Semin.
6. Funders’ Forum members: Austria: Richard Fuchsbichler; Belgium: Benno Hinnekint and Marie-José Simoen; Bulgaria: Atanas Atanassov; Cyprus: Spyros Spyrou and Antonis Theocharous; Czech Republic: nomination pending; Denmark: Lars Christensen; Estonia: Reesi Lepa; Finland: Helena Vänskä; France: Roxane Silberman; Germany: Manfred Niessen; Greece: John Yfantopoulos; Hungary: Katalin Pigler; Iceland: Fridrik Jónsson; Ireland: Fiona Davis; Israel: Bob Lapidot; Italy: Anna D’Amato; Latvia: Maija Bundule; Luxembourg: Ulrike Kohl; Netherlands: Ron Dekker; Norway: Ingunn Stangeby; Poland: Henryk Domanski and Renata Kuskowska; Portugal: Ligia Amâncio and Olga Dias; Romania: Ioan Dumitrache; Russia: Vladimir Andreenkov; Slovak Republic: Dusan Kovac and Daniela Kružinská; Slovenia: Peter Debeljak and Ida Pracek; Spain: Martin Martinez Ripoll; Sweden: Rune Åberg and Rolf Höijer; Switzerland: Brigitte Arpagaus; Turkey: Bilal Ahmetceoglu; Ukraine: Natalia Pohorilla; United Kingdom: Stephen Struthers; European Commission: Virginia Vitorino; European Science Foundation: Henk Stronkhorst and Gün Semin.
7. Members of the Methods Group are: Denise Lievesley, Chair; Norman Bradburn; Vasja Vehovar; Paolo Garonna; and Lars Lyberg.
8. Round 1 Questionnaire Design Teams and topics: ‘Citizenship, involvement and democracy’: Ken Newton; Hanspeter Kriesi; José Ramón Montero; Sigrid Rossteutscher; Anders Westholm; and ‘Immigration’: Ian Preston; Thomas Bauer; David Card; Christian Dustmann; James Nazroo. Round 2 Questionnaire Design Teams and topics: ‘Family, work and social welfare in Europe’: Robert Erikson; Janne Jonsson; Duncan Gallie; Josef Brüderl; Louis-André Vallet and Helen Russell; ‘Opinions on health and care seeking’: Sjoerd Kooiker; Nicky Britten; Alicja Malgorzata Oltarzewska; Jakob Kragstrup; Ebba Holme Hansen; and ‘Economic morality’: Susanne Karstedt; Stephen Farrall; Alexander Stoyanov; Kai Bussman and Grazyna Skapska. Round 3 Questionnaire Design Teams and topics: ‘Personal and social well-being’: Felicia Huppert; Andrew Clark; Claudia Senik; Joar Vitterso; Bruno Frey; Alois Stutzer; Nic Marks; Johannes Siegrist; ‘The timing of life: the organisation of the life course’: Francesco Billari; Gunhild Hagestad; Aart Liefbroer and Zsolt Spéder.
9. National Coordinators and countries: Austria: Karl Müller; Belgium: Geert Loosveldt and Marc Jacquemain; Bulgaria: Lilia Dimova; Cyprus: Spyros Spyrou and Antonis Theocharous; Czech Republic: Klára Plecitá-Vlachova; Denmark: Torben Fridberg; Estonia: Kairi Talves; Finland: Heikki Ervasti; France: Daniel
Boy, Bruno Cautrès and Nicolas Sauger; Germany: Jan van Deth; Greece: Yannis Voulgaris; Hungary: Peter Robert; Iceland: Fridrik Jónsson; Ireland: Susana Ferreira; Israel: Noah Lewin-Epstein; Italy: Sonia Stefanizzi; Luxembourg: Monique Borsenberger and Uwe Warner; Netherlands: Harry Ganzeboom; Norway: Kristen Ringdal; Poland: Pawel Sztabinski; Portugal: Jorge Vala; Romania: Mihaela Vlasceanu and Catalin Augustin Stoica; Russia: Anna Andreenkova; Slovak Republic: Jozef Vyrost; Slovenia: Brina Malnar; Spain: Mariano Torcal; Sweden: Mikael Hjerm; Switzerland: Dominique Joye; Turkey: Yilmaz Esmer; Ukraine: Andrii Gorbachyk; United Kingdom: Alison Park.
10. Sampling Panel members: Sabine Häder; Siegfried Gabler; Seppo Laaksonen; Peter Lynn.
11. Translation Panel members: Janet Harkness; Paul Kussmaul; Beth-Ellen Pennell; Alisu Schoua-Glousberg; Christine Wilson.
References

Almond, G. and Verba, S. (1963), The civic culture: political attitudes in five nations, Princeton: Princeton University Press.
Barnes, S. and Kaase, M. et al. (1979), Political action: mass participation in five western democracies, Beverly Hills: Sage.
Central Coordinating Team, City University (Unpublished, 2004), European Social Survey: Round 1: End of Grant Report, July 2004.
Cseh-Szombathy, L. (1985), ‘Methodological problems in conducting cross-national research on lifestyles’ in: L. Hantrais, S. Mangen and M. O’Brien (eds), Doing Cross-National Research (Cross-National Research Paper 1), Birmingham: Aston University, pp.55–63.
Davis, J. and Jowell, R. (1989), ‘Measuring national differences: An introduction to the International Social Survey Programme (ISSP)’ in: R. Jowell, S. Witherspoon and L. Brooks (eds), British Social Attitudes: Special International Report, Aldershot, UK: Gower, pp.1–13.
Deutscher, I. (1968), ‘Asking questions cross-culturally: Some issues of linguistic comparability’ in: H.S. Becker, B. Geer, D. Riesman and R.S. Weiss (eds), Institutions and the Person, Chicago: Aldine, pp.318–341.
Durkheim, E. (1964), The rules of the sociological method, 8th edition, New York: The Free Press.
ESF (European Science Foundation), Standing Committee of the Social Sciences (1996), The European Social Survey: Report of the SCSS Expert Group (April, 1996), Strasbourg: European Science Foundation.
ESF (European Science Foundation), Standing Committee of the Social Sciences (1999), The European Social Survey (ESS) – a Research Instrument for the Social Sciences in Europe: Summary, Strasbourg: ESF.
Hantrais, L. and Ager, D. (1985), ‘The language barrier to effective cross-national research’ in: L. Hantrais, S. Mangen and M. O’Brien (eds), Doing Cross-national
Research (Cross National Research Paper 1), Birmingham: Aston University, pp.29–40.
Harding, A. (1996), ‘Cross-national research and the “new community power”’ in: L. Hantrais, S. Mangen and M. O’Brien (eds), Doing Cross-national Research (Cross National Research Paper 1), Birmingham: Aston University, pp.29–40.
Harkness, J.A. (2003), ‘Questionnaire Translation’ in: J.A. Harkness, F. Van de Vijver and P. Mohler (eds), Cross-Cultural Survey Methods, NJ: Wiley.
Jones, E. (1963), ‘The courtesy bias in SE Asian surveys’, International Social Science Journal, 15 (1), pp.70–76.
Jowell, R. (1998), ‘How Comparative is Comparative Research?’, American Behavioral Scientist, 42 (2), pp.168–177.
Kaase, M. and Newton, K. (1995), Beliefs in Government, Oxford: Oxford University Press.
Kolsrud, K. and Skjåk, K.K. (2004), ‘Harmonising Background Variables in International Surveys’, Paper presented at the RC33 Sixth International Conference on Social Science Methodology, 16–20 August 2004, Amsterdam.
Lisle, E. (1985), ‘Validation in the social sciences by international comparison’ in: L. Hantrais, S. Mangen and M. O’Brien (eds), Doing Cross-national Research (Cross National Research Paper 1), Birmingham: Aston University, pp.11–28.
Lynn, P. (2003), ‘Developing quality standards for cross-national survey research: five approaches’, International Journal of Social Research Methodology, 6 (4), pp.323–336.
Miller, J., Slomczynski, K. and Schoenberg, R. (1981), ‘Assessing comparability of measurement in cross-national research’, Social Psychology Quarterly, 44 (3), pp.178–191.
Mitchell, R.E. (1965), ‘Survey materials collected in the developing countries: Sampling, measurement and interviewing obstacles to intra- and inter-national comparisons’, International Social Science Journal, 17 (4), pp.665–685.
Park, A. and Jowell, R. (1997), Consistencies and Differences in a Cross-National Survey, London: SCPR.
Rokkan, S. (1968), Comparative Research Across Cultures and Nations, Paris: Mouton.
Saris, W. and Kaase, M. (1997), Eurobarometer: Measurement Instruments for Opinions in Europe, Amsterdam: University of Amsterdam.
Saris, W., Van der Veld, W. and Gallhofer, I. (2004), ‘Development and improvement of questionnaires using predictions of reliability and validity’ in: S. Presser, J. Rothgeb, M. Couper, J. Lessler, E. Martin and E. Singer (eds), Questionnaire development, evaluation and testing, New York: Wiley.
Scherpenzeel, A. and Saris, W.E. (1997), ‘The validity and reliability of survey questions: A meta-analysis of MTMM studies’, Sociological Methods and Research, 25 (3), pp.341–383.
Scheuch, E.K. (1966), ‘The development of comparative research: Towards causal explanations’ in: E. Oyen (ed.), Comparative methodology: Theory and practice in international social research, London: Sage (1990), pp.19–37.
Teune, H. (1992), ‘Comparing countries: Lessons learned’ in: E. Oyen (ed.), Comparative methodology: Theory and practice in international social research, London: Sage (1990), pp.38–62.
Verba, S. (1971), ‘Cross-National Survey Research: The Problem of Credibility’ in: I. Vallier (ed.), Comparative Methods in Sociology: Essays on Trends and Applications, Berkeley: University of California Press.
2
How representative can a multi-nation survey be?

Sabine Häder & Peter Lynn*
Introduction

It is very important for a multi-nation survey to select an equivalent sample in each country: lack of equivalence in the samples can undermine the central objective of cross-national comparison. Achieving equivalence is, however, a major challenge. Not only do available sampling frames vary greatly in their properties from country to country, but so does usual sampling practice. Failure to select equivalent samples has been a common criticism of some multi-nation surveys. One of the most important innovations of the ESS has been the system it has devised and implemented to ensure that equivalent high-quality random samples are selected in each country. In this chapter we explain the principles underlying the ESS approach to sampling, describe the process used to implement these principles, and discuss the extent to which the ESS approach might usefully be applied to other cross-national surveys.1

* Sabine Häder is a Senior Statistician at Zentrum für Umfragen, Methoden und Analysen (ZUMA), Mannheim, Germany. Peter Lynn is Professor of Survey Methodology at the University of Essex, Colchester, UK. Both are members of the European Social Survey sampling panel.

1. We would like to thank Siegfried Gabler and Matthias Ganninger (both Centre for Survey Research and Methodology, Germany) for their help with the calculation of design weights and the estimation of design effects. We would also like to acknowledge the helpful comments made about this chapter by Caroline Roberts, City University, London.
Equivalence of samples

Although many surveys over the years have been used to make cross-national comparisons, such uses were often an after-thought. In very few cases have surveys been specifically designed to facilitate cross-national comparisons. Only in the last decade or so has it become recognised that design objectives – and therefore methods – for cross-national surveys should be somewhat different from those for national surveys. In 1997, Leslie Kish acknowledged this shifting perception:

Sample surveys of entire nations have become common all over the world during the past half century. These national surveys lead naturally to multinational comparisons. But the deliberate design of valid and efficient multinational surveys is new and on the increase. New survey methods have become widespread and international financial and technical support has created effective demand for multinational designs for valid international comparisons. The improved technical bases in national statistical offices and research institutes have become capable of implementing the complex task of coordinated research. However, to be valid, multinational surveys have to be based on probability sample designs of comparable national populations, and the measurements (responses) should be well controlled for comparability. (Kish, 1997, p.vii)

It may seem self-evident that samples should be equivalent for any comparative survey, but it is less evident what ‘equivalent’ means in practice. The ESS approach is to define equivalence in terms of two fundamental characteristics of a survey sample: the population that it represents and the precision with which it can provide estimates of the characteristics of that population. Our first step was to define the population that each national sample should represent and to insist that the definition was adhered to in each country so that the sample design provided complete, or very near-complete, coverage of that population.
The definition of the population was, “all persons aged 15 years or older resident in private households within the borders of the nation, regardless of nationality, citizenship, language or legal status”. The second step was to define what the word ‘represent’ should mean. We concluded that this should mean that each sample would be capable (at least in the absence of non-response) of providing statistically unbiased estimates of characteristics of the population. The only way to guarantee this is to use a strict probability sampling method, where every person in the population has a non-zero chance of being selected, and the selection probability of every person in the sample is known.
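The logic of the last sentence can be checked with a small simulation, a sketch with invented values and inclusion probabilities (not data from the chapter): when every unit's selection probability is known and non-zero, weighting each sampled value by the inverse of its probability (the Horvitz–Thompson estimator) is unbiased for the population total.

```python
import random

# Tiny invented "population": values and known, non-zero inclusion
# probabilities. Illustrative only, not ESS data.
random.seed(42)
values = [1, 2, 3, 4, 5, 6, 7, 8]
probs = [0.2 + 0.05 * i for i in range(1, 9)]   # 0.25 .. 0.60, all non-zero

true_total = sum(values)                        # 36

def ht_estimate():
    """Draw one Poisson sample (each unit included independently with its
    own probability) and weight each sampled value by 1/probability."""
    return sum(y / p for y, p in zip(values, probs) if random.random() < p)

reps = 20000
mc_mean = sum(ht_estimate() for _ in range(reps)) / reps
# Averaged over many samples, the estimate settles near true_total.
```

If any unit had a zero selection probability its value could never enter the estimate, and if a probability were unknown the 1/p weight could not be formed – which is why both conditions appear in the definition above.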
But providing unbiased estimates is necessary but not sufficient for comparative research. The estimates also have to be sufficiently precise to be useful. Broadly, they should be precise enough to detect all but minor differences between nations. However, to assess the implication of this requirement for sample sizes would require separate consideration, for each estimate, not only of population variability but also of the likely magnitude of between-nation differences. For a survey such as the ESS that asks a wide variety of questions, this would be an enormous task. And even if it could be achieved, the conclusion would be different for each estimate. So instead, our approach was to consider a single imaginary estimate with “typical” properties and then to identify the minimum precision that would be required for such a variable. On this basis, it was agreed that precision equivalent to a simple random sample (SRS) of 1,500 respondents would be sufficient for most purposes without being prohibitively expensive. The strategy from then on was to ensure that this sort of precision was achieved in each nation. But since the design in most nations would involve departures from simple random sampling, for instance clustering and variable selection probabilities, it was also necessary to predict the impact of these departures on precision (known as ‘design effects’) before determining the required nominal sample size in each nation. This is a very unusual step for a cross-national survey to take at the specification stage. Typically, such surveys content themselves with specifying the same nominal sample size for each country (i.e. the same number of interviews), under an implicit assumption that this is what determines the precision of estimation. Instead, the ESS introduced a further innovation by implementing a standard way of predicting design effects in each country and then requiring each nation to aim for the same effective (rather than nominal) sample size.
Design effects will, of course, differ for different estimates, depending on how they are associated with selection probabilities, strata and weights. So, to ensure consistent decisions across nations, we determined that design effects should be predicted with respect to an imaginary estimate with standard properties, in particular that the estimate is not correlated with selection probabilities or strata and is modestly correlated with clusters (an intra-cluster correlation of 0.02).2 Although this may be a reasonable approximation to the properties of many ESS variables, there would, of course, also be many variables for which the approximation did not hold. But the strategy has the advantage both of simplicity (as we shall see later, it is not too demanding to predict design effects once these simplifying assumptions have been made) and of consistency (the same basis for decisions is used in every country). The strategy is evaluated in Lynn et al. (forthcoming).

2. Alternative intra-cluster correlation predictions were permitted provided that these could be justified based on estimates from national surveys. In practice, almost all participating nations used the default value of 0.02.

Sample sizes

It is perhaps worth emphasising why equal precision in each nation is a desirable goal. It stems from the idea that a key objective of cross-national research is to make comparisons between nations, whether in terms of simple descriptive statistics, associations between variables, or complex multivariate statistics. For any given overall budget, the precision of an estimate of the difference between parameters for two nations is maximised if the precision of the estimates of the two individual parameters is approximately equal (assuming that precision comes at a similar price in each nation).

Figure 2.1 The relationship between precision of an estimated difference between nations and relative national sample sizes, assuming simple random sampling

[Figure 2.1 plots the standard error of the estimate of the difference (y-axis) against the share of the total sample allocated to nation 2, n2/n (x-axis, 0.2 to 1), for three scenarios: equal costs and variances; equal cost, variance 50 per cent greater in nation 1; and equal variance, cost double in nation 2.]
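The calculation behind Figure 2.1 can be sketched in a few lines. The total of 3,000 interviews and the unit variances below are illustrative assumptions, not ESS figures:

```python
import math

def se_difference(share2, total_n=3000, var1=1.0, var2=1.0):
    """Standard error of an estimated difference in means between two
    nations under SRS, given the share of the total sample in nation 2."""
    n2 = total_n * share2
    n1 = total_n - n2
    return math.sqrt(var1 / n1 + var2 / n2)

shares = [i / 10 for i in range(1, 10)]     # n2/n from 0.1 to 0.9
ses = {s: se_difference(s) for s in shares}
best = min(ses, key=ses.get)                # minimised at an equal split
# The curve is flat near the middle: a 40/60 split costs only ~2% precision.
```

Re-running with var1=1.5, or with a smaller affordable sample in the more expensive nation, reproduces the asymmetric curves that the text goes on to discuss.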
This can be illustrated by a simple example. Suppose we want to compare the mean of a variable y between two nations based on SRS in each nation. If the population variability of y is similar in each nation, then precision depends only on sample size. Figure 2.1 shows that, if the cost per sample
member is the same in each nation, maximum precision is obtained by having the same sample size in each nation (the lowest of the three lines on the graph). Reassuringly, the graph is fairly flat in its central region, indicating that slight departures from equal sample size in each nation will make little difference to the precision of comparisons. The graph also shows what happens to the precision of an estimated difference between nations if the assumptions of equal population variance or equal costs do not hold. The graph becomes asymmetrical, implying that a larger proportion of the sample should be devoted to the nation with the lower data collection costs or the higher variability in its population. However, provided that the differences in costs or variance are not too great, an equal sample size in each country will still provide precision that is almost as good as the maximum theoretically obtainable. In any case, in a cross-national survey such as the ESS, it would not be possible to vire data collection costs between nations. So it is not appropriate to consider costs when determining sample allocation across nations.

Achieving equivalence

It is one thing to have a goal of equivalent samples and a concept of what equivalence should mean. It is quite another to achieve equivalence in practice. The experiences of previous multi-nation surveys (e.g. Park and Jowell, 1997) led us to believe that the process by which sample designs are developed, agreed and implemented is at least as important as the specification to which designs should adhere. Consequently, we developed a detailed process that we felt would minimise the chances of ending up with sample designs that failed to meet the specification or that were sub-optimal in other preventable ways. We recognised that this process would require considerable resources, but felt it to be justified given that the sample is one of the foundation stones on which any survey is constructed.
So, a small panel of sampling experts was set up from the start – the “ESS sampling panel”. The panel is responsible for producing the sample specification, agreeing the actual sample design for each nation and reviewing the implementation of the designs. Two features of the operation of the panel are notable. First, its approach is co-operative rather than authoritarian. The sampling panel sees its primary role as providing advice and assistance where needed, even though it also has to sign off and ‘police’ the sample designs. Second, its interactions are intensive. Regular contact is made between relevant parties, promoting rapport and co-operation. To our knowledge, this process is unique among cross-national social surveys, and is described in more detail later in this chapter. An important feature of the process is that a particular member of the sampling panel is allocated to each nation and the goal is to develop close working
arrangements with the respective National Coordinator – a process that often leads to an extended dialogue which seems to be much appreciated by both parties. We now describe each element of the ESS’s sampling procedure.

Population coverage

We have already referred to the definition of the ESS target population. As it turned out, however, this seemingly simple definition proved surprisingly difficult to apply in several nations. Some nations were not used to working with a lower age limit as low as 15; others were used to applying an upper age limit, typically of either 75 or 80. Even so, the ESS’s age specification (15+, no upper limit) has been strictly adhered to in all countries. The lower limit of 15 proved to be especially difficult for two countries (Italy and Ireland) which used their electoral registers as a sampling frame. So in these countries the registers were used just as a frame of households or addresses, with interviewers implementing a special procedure to sample from all persons resident at each selected address. In some countries (e.g. Ireland and Poland), 15–17 year olds could be interviewed only with parental consent. Another important aspect of the population definition was that a person’s first language should not be an undue barrier to selection or participation. Thus questionnaires had to be made available in all languages spoken as a first language by five per cent or more of the population, and interviewers had to be available to administer them. In some countries it turned out that ‘complete’ geographical coverage of the population would be too costly or even too dangerous to achieve. For instance, on cost grounds Jersey, Guernsey and the Isle of Man were excluded from the United Kingdom sample, just as Ceuta and Melilla were excluded from the sample for Spain, and the smaller islands were excluded from the sample for Greece.
In Round 1, the Palestinian residents of (East) Jerusalem were excluded from the sample for Israel because at that time it would have been too dangerous for interviewers to work there. And in some countries where the sampling frame was a population register it proved impossible to include illegally resident persons. Such deviations from the ideal were discussed and agreed in advance with the sampling panel, which ensured that there were no serious deviations from the definition of the target population.

Sampling frames

An important prerequisite for sampling is the availability of a suitable sampling frame. In many countries it was a major challenge to find a regularly
Table 2.1 Sampling frames used on ESS Round 2

Country          Frame                                                 Remarks
Austria          Telephone book                                        Additional non-telephone households were sampled in the field, as described in the text below
Belgium          National Register
Czech Republic   Address register UIR-ADR                              Used to select streets, followed by field enumeration
Denmark          Danish Central Person Register
Estonia          Population Register
Finland          Population Register
France           None                                                  Area-based sampling
Germany          Registers from local residents' registration offices
Greece           None                                                  Area-based sampling
Hungary          Central Registry
Iceland          National Register
Ireland          National Electoral Register
Luxembourg       Social Security Register
Netherlands      Postal address list (TPG-Afgiftenpuntenbestand)
Norway           BEBAS Population Register
Poland           National Register of Citizens (PESEL)
Portugal         None                                                  Area-based sampling
Slovakia         Central Register of Citizens
Slovenia         Central Register of Population
Spain            Continuous Census
Sweden           Register of population
Switzerland      Telephone register
Turkey           Clusters of addresses
UK               Postal address list (Postcode Address File)
Ukraine          None                                                  Area-based sampling
updated, complete and accessible frame. For instance, whereas Austria has a regularly updated computer-based population register, it may not be used as a sampling frame in social surveys. To illustrate the diversity of available frames, Table 2.1 lists the frames used in Round 2 for the selection of individuals, households or addresses. Almost all countries that participated in both of the first two rounds of the ESS used the same frame on both occasions. There were, however, two exceptions: the Czech Republic and Spain. In the Czech Republic, it turned out that the frame used in Round 1 did not meet expectations in respect of coverage and updating.3

3. The “SIPO” database of households. This is compiled by merging utility lists of households that subscribe to electricity, gas, radio, television or telephone.
It was thus changed in Round 2. In Spain, we took advantage of the fact that it was possible to change from a frame of households (in Round 1) to one of individuals (in Round 2). In five countries (Czech Republic, France, Greece, Portugal and Ukraine), it proved impossible to find an appropriate frame, and in each case area-based designs had to be deployed.

Sample designs

We have already mentioned that to meet the ESS objectives in respect of bias and precision, it was essential to use strict probability samples everywhere. Even so, the nature of the sample designs was to vary greatly between nations. That in itself is, of course, not a problem, since to achieve equivalence of outcomes it is not necessary to use identical inputs. But it did mean that the sampling panel had to exert careful control over all designs to ensure that they really were comparable. In Table 2.2 we give an overview of the designs applied in Round 2 of the ESS. It can be seen that the designs varied from simple random sampling (e.g. Denmark) on the one hand to four-stage stratified, clustered designs (e.g. Ukraine) on the other. Moreover, while in some cases a sample of persons could be selected directly from the frame (indicated by a P in the Units column of Table 2.2), in other cases – where the units selected were households or addresses – the final task of selecting an individual to interview had to be carried out in the field. Table 2.2 can thus provide only a summary of the variation in designs. To illustrate the complexity of some of the designs implemented, we now describe two of them in more detail.

In Austria the only available sampling frame is the telephone book. But this covers only about 90 per cent of households: not covered are households without a fixed-line telephone and households with secret numbers. To give these latter groups a chance of selection we developed the following design. Firstly, the Austrian municipalities were sorted into 363 strata, formed from 121 districts and three classes of population size (small municipalities with fewer than 2,500 inhabitants, medium municipalities with 2,500 to under 10,000, and large municipalities with 10,000 inhabitants or more). At stage 1 the Primary Sampling Units (PSUs) were selected. These are clusters within municipalities. The number of clusters in a stratum was proportional to the size of its target population; the allocation was done by controlled rounding (Cox, 1987). Within a stratum, clusters were selected by systematic random sampling with probability proportional to size. At stage 2, five addresses of households in the telephone book were drawn in each cluster. These formed the first part of the sample. To include the non-listed households as well, the interviewer took each “telephone household” as a starting point and identified the tenth subsequent household in the field (according to a specified rule for the random walk). The five households found with this method formed the second part of the sample.4 Then, at stage 3, an individual was selected at each address using the next birthday method.

4. This method gives telephone households twice the chance of selection of non-telephone households, so this was taken account of by weighting.
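The stage-1 selection described above combines stratification with systematic probability-proportional-to-size (PPS) sampling. A minimal sketch for one stratum follows; the municipality sizes and the helper function are invented for illustration, and the controlled-rounding allocation step is omitted:

```python
import random

def systematic_pps(sizes, n_select, rng):
    """Systematic PPS selection: lay the units end-to-end on a line whose
    length is the total size, then take equally spaced points from a
    random start. Larger units are hit with proportionally higher
    probability."""
    total = sum(sizes)
    step = total / n_select
    start = rng.uniform(0, step)
    targets = [start + i * step for i in range(n_select)]
    selected, cum, j = [], 0.0, 0
    for idx, size in enumerate(sizes):
        cum += size                 # running end-point of unit idx
        while j < len(targets) and targets[j] < cum:
            selected.append(idx)    # unit idx covers the j-th point
            j += 1
    return selected

rng = random.Random(7)
cluster_sizes = [500, 1200, 800, 300, 2200, 900, 1100]  # invented populations
chosen = systematic_pps(cluster_sizes, 3, rng)           # three PSUs
```

Since no unit here is larger than the sampling interval, the three selected PSUs are always distinct.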
Table 2.2 Overview of ESS sample designs in Round 2

Country          Design                                              Units  ngross  nnet  neff
Austria          strat, clus, 3 stages, 360 pts                      H      3672    2556  1390
Belgium          cities: srs; rest: strat, clus, 2 stages, 324 pts   P      3018    1778  1487
Czech Republic   strat, clus, 4 stages, 275 pts                      A      4333    3026  773
Denmark          srs                                                 P      2433    1487  1487
Estonia          systrs                                              P      2867    1989  1989
Finland          systrs                                              P      2893    2022  2022
France           strat, clus, 3 stages, 200 pts                      A      4400    1806  1114
Germany          strat, clus, 2 stages, 163 pts                      P      5868    2870  1288
Greece           strat, clus, 3 stages, 528 pts                      A      3056    2406  1469
Hungary          cities: srs; rest: strat, clus, 2 stages            P      2463    1498  ? (note 5)
Iceland          srs                                                 P      1200    579   579
Ireland          strat, clus, 3 stages, 250 pts                      A      3981    2286  936
Luxembourg       stratrs                                             P      3497    1635  1419
Netherlands      stratrs                                             A      3009    1881  1568
Norway           srs                                                 P      2750    1760  1761
Poland           cities: srs; rest: strat, clus, 2 stages, 158 pts   P      2392    1717  1078
Portugal         strat, clus, 3 stages, 326 pts                      A      3094    2052  787
Slovakia         srs                                                 P      2500    1512  1512
Slovenia         strat, clus, 2 stages, 150 pts                      P      2201    1442  952
Spain            strat, clus, 2 stages, 503 pts                      P      3213    1663  1176
Sweden           srs                                                 P      3000    1948  1948
Switzerland      strat, clus, 3 stages, 287 pts                      A      4863    2141  1398
Turkey           strat, clus, 3 stages, 200 pts                      A      –       –     –
UK               GB: strat, clus, 3 stages, 163 pts; NI: srs         A      4032    1897  1123
Ukraine          strat, clus, 4 stages, 300 pts                      A      3050    2031  600

Notes: strat: stratified. clus: clustered. srs: simple random sample. systrs: systematic random sample. stratrs: stratified (unclustered) random sample. stages: number of stages of selection. pts: number of sample points. H: household. P: person. A: address. The sample of Turkey was not included in data release 2, so exact sample sizes are not yet known.
The Austrian example shows that, even if no complete frame is available, it is possible to implement a probability sample design. Moreover, Austria is an example of a country in which we were able to improve the design at Round 2 compared to Round 1, based on the evidence of Round 1. Having found a high level of homogeneity within sample points at Round 1, which resulted in a relatively large design effect due to clustering, we decided to increase the number of clusters
5. The Hungarian team used a sample design that was not agreed with the sampling expert panel and that differs from the signing-off form. This design led to huge design effects due to varying inclusion probabilities (see Table 2.3) and due to clustering (see Table 2.4). However, owing to a personnel change in the statistical department it was not possible to clarify details of the sample design subsequently. An improvement of the design in Round 3 is essential.
from 324 (in Round 1) to 360 (in Round 2). In addition, the clustering (and resultant design effect) was further reduced by requiring the interviewer to visit only the tenth household after the starting address instead of the fifth.

A second rather complex design is the one in Ukraine. Since probability sampling is not the usual method there, and because no population register is available as a sampling frame, it was challenging to find an acceptable design. The first stage of the agreed design was to stratify by ‘settlement’. The strata were defined by 11 geographic regions and seven types of settlement. Altogether, there proved to be 56 non-empty strata. 300 sample points were allocated to the strata in proportion to the size of the stratum population using the Cox method. In some cases the stratum consisted of a single settlement (a large city), while in others it was necessary to select settlements with probability of selection proportional to population size, with replacement. At stage 2, streets were selected as sample points using simple random sampling. For this, a register of streets within settlements was available as a sampling frame. Unfortunately, however, this register did not contain any information on the number of addresses or households in each street. So selecting streets with equal probabilities was the only possibility. At stage 3 the interviewers listed the dwellings in each sampled street, excluding any that were obviously vacant. The lists were returned to the central office, where an appropriate number of selections were made from each street using systematic random sampling with a fixed sampling fraction. Finally, at stage 4, one eligible person was selected by the interviewer using the last birthday method.

Table 2.2 also shows that the required minimum effective sample size, neff = 1500, was not reached in a number of countries, such as the Czech Republic, Ireland, Portugal or Ukraine.
The reason for this in many cases was that the intra-class correlation turned out to be higher than predicted. For instance, in Portugal we found a median ρ = 0.16 in a set of eight selected variables, a large deviation from the value of ρ = 0.02 that we had initially suggested. But this suggestion had emerged from an analysis of the British Social Attitudes survey, which includes several items similar to ESS items, and a similar level of intra-class correlation (ρ = 0.04) was indeed found in the ESS for the UK and three other countries. But there seem to be marked cross-cultural differences in the amount of homogeneity within clusters. In largely rural countries such as Portugal, clustering effects turn out to have a bigger impact than they do in highly urbanised countries such as Germany. Another source of high homogeneity in certain countries could be fieldwork practices, but we cannot yet be sure. In particular, the design effect due to interviewer effects has to be considered, and it may be that in future rounds the minimum effective sample size neff = 1500 proves too demanding to sustain as a strict requirement.
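The arithmetic linking achieved interviews, clustering and effective sample size can be sketched with the standard Kish approximation for the clustering design effect, deff_c = 1 + (b - 1)ρ, where b is the average number of interviews per cluster (Kish, 1965). The interview counts and cluster size below are illustrative, not figures from the ESS data:

```python
def effective_sample_size(n_net, b, rho, deff_p=1.0):
    """Effective sample size: net interviews divided by the product of the
    design effects due to weighting (deff_p) and clustering (deff_c)."""
    deff_c = 1 + (b - 1) * rho      # Kish approximation for clustering
    return n_net / (deff_p * deff_c)

# With the default prediction rho = 0.02, clustering costs relatively little...
n_eff_low = effective_sample_size(n_net=2000, b=10, rho=0.02)    # about 1695
# ...but with rho = 0.16, as found in Portugal, the loss is severe.
n_eff_high = effective_sample_size(n_net=2000, b=10, rho=0.16)   # about 820
```

The same relation, read the other way round, is what the specification used to inflate the target of 1,500 effective interviews into each nation's required nominal sample size.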
Design weights

It is clear that the designs we described for Austria and Ukraine are not equal probability designs. The same was true for many of the other ESS designs too. So it was vitally important for the selection probabilities at each stage to be recorded. These records were subsequently used by the sampling panel to calculate design weights, which were the inverse of the product of the inclusion probabilities at each stage. In Round 1, the weights for all respondents were equal only in Belgium, Denmark, Finland, Sweden, Hungary and Slovenia. In Round 2, the same was true for Estonia, Iceland, Norway and Slovakia.6 These countries used equal probability selection methods. In all other countries, however, selection probabilities (and hence design weights) varied to some degree for the following reasons:

• Unequal inclusion probabilities in countries where frames of households or addresses were used, such as the United Kingdom and Greece. For example, persons in households with four persons aged 15 years and older had a selection probability one-quarter that of a person in a single-adult household.

• Unequal inclusion probabilities because of over-sampling in some strata. An example was Germany where, because separate analyses were required for the eastern and western parts of the country, East Germans were over-sampled. As a result, 1,020 persons in East Germany and 2,046 persons in West Germany were interviewed, even though the actual proportion of East Germans in the population is only around 20 per cent. In other countries, such as the Netherlands and Poland, sampling fractions were varied slightly over strata in anticipation of differences in response rates.

When analysing the data it is, of course, important to use the design weights that accompany the dataset, which have been calculated to correct for these variations in selection probabilities.
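A toy example (invented respondents, not ESS data) of how such weights are derived and applied for an address sample, where one adult is interviewed per address and a person's selection probability is therefore inversely proportional to the number of adults in the household:

```python
# Each respondent's design weight is proportional to the number of adults
# (15+) in the household, i.e. the inverse of the within-household
# selection probability. Invented data for illustration.
respondents = [
    {"adults": 1}, {"adults": 1}, {"adults": 2},
    {"adults": 2}, {"adults": 2}, {"adults": 4},
]
for r in respondents:
    r["weight"] = r["adults"]

total_weight = sum(r["weight"] for r in respondents)

# Share living in single-adult households, before and after weighting.
unweighted = sum(r["adults"] == 1 for r in respondents) / len(respondents)
weighted = sum(r["weight"] for r in respondents if r["adults"] == 1) / total_weight
# The unweighted share (1/3) over-represents people living alone; the
# weighted share (1/6) corrects this, as in the Spanish example in the text.

# Precision lost to the weight variation (Kish's approximation, the
# quantity the chapter calls DEFFp in the next section).
n = len(respondents)
deff_p = n * sum(r["weight"] ** 2 for r in respondents) / total_weight ** 2
```

Here deff_p works out at 1.25: the weight variation makes these six interviews worth only about five equal-probability ones.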
Unless they are used, some samples will turn out strangely, for instance heavily over-representing East Germans or persons living alone, according to the above examples. In Spain, for instance, before introducing the design weights, some 14.9 per cent of respondents apparently lived in single-person households; after properly applying them, this proportion reduces to 6.1 per cent. Figure 2.2 gives an overview of the variation in design weights calculated for Round 2. The differences between nations in the shape of the distribution are striking. The nine nations with equal-probability designs stand out as consisting of a single vertical bar.
6 In Round 2, Hungary changed to an unequal probability sampling scheme.
MEASURING ATTITUDES CROSS-NATIONALLY

Figure 2.2 Distribution of design weights in Round 2
Note: Round 2 data from Turkey was not included in data release 2 and no sample design file had been received, so exact sample sizes, design weights and design effects are not yet known. Turkey is therefore excluded from this and the following tables and figures.
How representative can a multi-nation survey be?

Design effects

We have argued that it is necessary, and even desirable, to allow variation in sample designs in order to achieve equivalence. However, the more a sample is clustered and the greater the variation in its inclusion probabilities, the less “effective” it is. In other words, more interviews need to be conducted to obtain the same precision of estimates when a complex design is used as compared with simple random sampling. As noted, we can measure this loss in precision by the design effect (Kish, 1965). The variation between nations in sample designs meant that there was likely to be considerable variation in design effects too. That is why we tried to predict the design effects and to use them to determine the number of interviews that would be needed to meet the criterion of a minimum effective sample size of 1500. For the prediction of design effects we chose a model-based approach that takes two components into account:

• Design effect due to differing selection probabilities (DEFFp)

If differing selection probabilities were to be used, then the associated design effect was predicted using the following formula:

    DEFFp = m · Σi mi wi² / (Σi mi wi)²,   with m = Σi mi the total number of respondents,
where there are mi respondents in the ith selection probability class, each receiving a weight of wi. An overview of the predicted and estimated design effects due to differing inclusion probabilities for both Rounds 1 and 2 of the ESS is given in Table 2.3. The post-fieldwork estimated design effects have been calculated in exactly the same way as the pre-fieldwork predictions, but using the realised design weights and sample sizes rather than the anticipated ones. In other words, they apply to our imaginary variable that is not correlated with strata or clusters. In most countries the predicted design effects were rather close to the realised design effects in Round 1. Only in Portugal and Israel did we notably under-predict the design effect. In Round 2 we were able to improve on our predictions, and in some cases on the designs too, based on the Round 1 data. This led to a lower anticipated design effect in a number of countries (Austria, France, Ireland, Luxembourg); indeed, in three of those four countries the estimated design effects were equal to the predictions or even lower. Meanwhile, in Spain and Norway a new design was used that did not involve varying inclusion probabilities, so DEFFp = 1. And in Portugal a remarkable decrease in DEFFp took place, due to an improvement in the design. Only in the Czech Republic and the Ukraine were there increases in the design
effects due to differing selection probabilities, and in the Ukraine this was the result of a new sample design.7

Table 2.3 Design effects due to differing selection probabilities: Rounds 1 and 2
Country          Predicted    Estimated    Predicted    Estimated
                 DEFFp R1     DEFFp R1     DEFFp R2     DEFFp R2
Austria          1.4          1.2          1.3          1.3
Belgium          1.0          1.0          1.0          1.0
Czech Republic   1.2          1.2          1.5          1.5
Denmark          1.0          1.0          1.0          1.0
Estonia          –            –            1.0          1.0
Finland          1.0          1.0          1.0          1.0
France           1.3          1.2          1.2          1.2
Germany          1.1          1.1          1.1          1.1
Greece           1.2          1.2          1.2          1.2
Hungary          1.0          1.0          1.0          2.2
Iceland          –            –            1.0          1.0
Ireland          1.3          1.0          1.0          1.3
Israel           1.3          1.6          –            –
Italy            1.0          1.2          –            –
Luxembourg       1.4          1.3          1.3          1.2
Netherlands      1.2          1.2          1.2          1.2
Norway           1.2          1.0          1.0          1.0
Poland           1.0          1.0          1.0          1.0
Portugal         1.1          1.8          1.2          1.4
Slovakia         –            –            1.0          1.0
Slovenia         1.0          1.0          1.0          1.0
Spain            1.1          1.2          1.0          1.0
Sweden           1.0          1.0          1.0          1.0
Switzerland      1.2          1.2          1.2          1.2
UK               1.2          1.2          1.2          1.3
Ukraine          –            –            1.2          3.4

Note: Figures are rounded to one decimal place; – indicates no entry for that round
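The DEFFp formula can be checked numerically. The sketch below uses invented selection-probability classes, not ESS data:

```python
# DEFFp = m * sum(m_i * w_i^2) / (sum(m_i * w_i))^2, where m_i respondents
# fall into the i-th selection probability class with design weight w_i and
# m = sum(m_i) is the total number of respondents.

def deff_p(classes):
    """classes: list of (m_i, w_i) pairs, one per selection probability class."""
    m = sum(mi for mi, _ in classes)
    numerator = m * sum(mi * wi ** 2 for mi, wi in classes)
    denominator = sum(mi * wi for mi, wi in classes) ** 2
    return numerator / denominator

# An equal-probability design (all weights equal) loses nothing: DEFFp = 1.
assert deff_p([(1000, 1.0)]) == 1.0

# A hypothetical design in which half the respondents carry double weight
# (e.g. adults selected from two-adult households):
assert round(deff_p([(500, 1.0), (500, 2.0)]), 2) == 1.11
```

A DEFFp of 1.11 means roughly 11 per cent more interviews are needed to match the precision of an equal-probability sample of the same size.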
• Design effect due to clustering (DEFFc)

In most countries multi-stage, clustered sample designs were used. In such situations there is also a design effect due to clustering, which can be calculated as follows:

    DEFFc = 1 + (b – 1) ρ

where b is the mean number of respondents per cluster (but see Lynn and Gabler, 2005) and ρ is the intra-cluster correlation (or “rate of homogeneity”) – a measure of the extent to which persons within a clustering unit are more

7 We do not discuss the case of Hungary here (see footnote 4).
homogeneous than persons within the population as a whole (Kish, 1995) – see Table 2.4. In the first round this design effect could be predicted, at least crudely, from knowledge of other surveys and/or the nature of the clustering units. In some countries calculations were made to estimate intra-class correlation coefficients from earlier surveys with similar variables. Where there was no empirical evidence upon which to base an estimate of the intra-class correlation coefficient, a default value of 0.02 was used; in fact, this value was used in most countries. However, as noted, this turned out in many countries to be an under-prediction of the actual homogeneity within clusters.

Table 2.4 Design effects due to clustering: Rounds 1 (R1) and 2 (R2)
Country          Predicted    Estimated    Predicted    Estimated
                 DEFFc R1     DEFFc R1     DEFFc R2     DEFFc R2
Austria          1.1          1.6          1.2          1.5
Belgium          1.1          1.2          1.2          1.2
Czech Republic   1.1          1.3          1.3          2.6
Denmark          1.0          1.0          1.0          1.0
Estonia          –            –            1.0          1.0
Finland          1.0          1.0          1.0          1.0
France           1.2          1.3          1.2          1.4
Germany          1.4          2.0          1.7          2.0
Greece           1.1          1.6          1.1          1.4
Hungary          1.2          1.4          1.0          3.3
Iceland          –            –            1.0          1.0
Ireland          1.2          1.9          1.3          1.9
Israel           1.2          2.4          –            –
Italy            1.1          1.8          –            –
Luxembourg       1.0          1.0          1.0          1.0
Netherlands      1.2          1.0          1.0          1.0
Norway           1.2          1.6          1.0          1.0
Poland           1.1          1.8          1.1          1.6
Portugal         1.1          1.6          1.2          1.9
Slovakia         –            –            1.0          1.0
Slovenia         1.4          1.3          1.4          1.4
Spain            1.2          1.6          1.3          1.4
Sweden           1.0          1.0          1.0          1.0
Switzerland      1.2          1.3          1.1          1.2
UK               1.3          1.4          1.3          1.3
Ukraine          –            –            1.1          1.7

Note: Figures are rounded to one decimal place; – indicates no entry for that round
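The clustering formula is a one-liner; the sketch below also illustrates the two levers for reducing it, fewer interviews per cluster or more heterogeneous clusters. The Greek figures (b = 4, ρ = 0.04) are taken from the worked example later in the chapter:

```python
# DEFFc = 1 + (b - 1) * rho, where b is the mean number of respondents per
# cluster and rho is the intra-cluster correlation ("rate of homogeneity").

def deff_c(b, rho):
    return 1 + (b - 1) * rho

# Greece, Round 2: 4 interviews per sample point, predicted rho = 0.04.
assert round(deff_c(4, 0.04), 2) == 1.12

# Reducing b (more sample points) or rho (larger, more heterogeneous
# sample points) both shrink the design effect:
assert deff_c(10, 0.04) > deff_c(5, 0.04) > deff_c(5, 0.02)
```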
Since the number of respondents per cluster (b) and the intra-class correlation (ρ) both influence the design effect, we were able – based on estimated design effects from Round 1 – to suggest improvements to the designs in Round 2 in some countries. The improvements were made either by reducing b through an increase in the number of sample points, or by increasing the size
of the sample points in the hope that this would lead to a reduction in ρ. So in Switzerland we increased the number of sample points from 220 to 287, in France from 125 to 200, and in Portugal from 150 to 326. But such changes were not always possible, because they tend to increase fieldwork costs. Estimates of ρ and the realised values of the mean cluster size b are presented in Table 2.5 for two of the ESS’s attitude scales. The differences between nations in the extent to which attitudes cluster within the population are striking. On the left–right dimension of political values, for instance, small areas seem to be quite homogeneous in Austria, the Czech Republic, Greece, Ireland, Slovenia, Spain and especially in Ukraine, but relatively heterogeneous in France, Hungary, Poland and the UK. A similar picture emerges with the life satisfaction scale: Austria, the Czech Republic, Greece, Ireland, Portugal, Spain and Ukraine are again amongst those with large intra-cluster correlations, but Slovenia is now missing from this group.
Table 2.5 Estimated ρ for two measures, Round 2

                 Left–right scale           Satisfaction with life
Country          ρ        Mean b            ρ        Mean b
Austria          0.10     5.5               0.10     6.2
Belgium          0.05     5.0               0.03     5.4
Czech Republic   0.09     9.5               0.09     11.1
France           0.03     8.7               0.04     9.3
Germany          0.05     16.0              0.07     17.6
Greece           0.08     4.1               0.09     4.7
Hungary          0.03     23.2              0.05     27.5
Ireland          0.08     10.3              0.09     12.2
Poland           0.00     4.2               0.07     4.5
Portugal         0.13     4.9               0.22     6.4
Slovenia         0.10     6.4               0.04     8.8
Spain            0.10     3.8               0.11     4.0
Switzerland      0.06     7.3               0.04     7.9
UK               0.04     10.2              0.05     11.3
Ukraine          0.27     6.8               0.09     9.4
Note: Differences in mean cluster size b between the two measures are due to differences in the level of item non-response
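Intra-cluster correlations such as those in Table 2.5 are commonly estimated with a one-way ANOVA estimator. The sketch below assumes balanced clusters and illustrates the idea; it is not necessarily the estimator the sampling panel used:

```python
# One-way ANOVA estimator of the intra-cluster correlation rho for
# equal-sized clusters: rho = (MSB - MSW) / (MSB + (b - 1) * MSW).

def rho_anova(clusters):
    k = len(clusters)                 # number of clusters
    b = len(clusters[0])              # respondents per cluster (balanced)
    grand = sum(sum(c) for c in clusters) / (k * b)
    means = [sum(c) / b for c in clusters]
    msb = b * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((y - m) ** 2
              for c, m in zip(clusters, means) for y in c) / (k * (b - 1))
    return (msb - msw) / (msb + (b - 1) * msw)

# Perfectly homogeneous clusters: all variation lies between clusters.
assert rho_anova([[1, 1, 1], [2, 2, 2], [3, 3, 3]]) == 1.0

# Clusters that each mirror the whole population: rho at or below zero.
assert rho_anova([[1, 2, 3], [1, 2, 3], [1, 2, 3]]) <= 0.0
```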
The total design effect is the product of the design effect due to differing selection probabilities (DEFFp) and the design effect due to clustering (DEFFc). These total design effects vary considerably between countries. It is also instructive to note that a change in sample design within a country is always reflected in the design effect. When improvements are made – such as an increase in the number of PSUs (primary sampling units) or the application of a less complex design – this leads to a
reduction of DEFF. In this respect, design effects seem to be a good measure of the quality of samples. Once the design effect had been predicted for a country, a target minimum net sample size (number of completed interviews) was set for that country, so as to produce approximately equal precision across countries (see the last three columns of Table 2.2). As can also be seen in Table 2.2, our predictions were successful in most, but not all, cases. But how was this calculation of net sample sizes carried out?

Sample size

For the calculation of the sample sizes we used the following formulae:

    nnet = neff * DEFFp * DEFFc = 1500 * DEFF
    ngross = nnet / (ER * RR)

where nnet is the net sample size, neff is the effective sample size, fixed at 1500 (but see footnote 8), ngross is the gross sample size, ER is the eligibility rate and RR is the response rate.

To illustrate these formulae we examine the case of Greece (Round 2). In Greece we had estimated from Round 1 data a design effect due to differing inclusion probabilities within households of DEFFp = 1.22. (We had previously predicted DEFFp = 1.18 at Round 1 based on external data but, as with many countries, were now able to use Round 1 data to inform our prediction for Round 2.) The prediction of DEFFc was based upon the anticipated mean number of interviews per sample point (4) and a predicted ρ of 0.04 (predicted at Round 1, because the units used as sample points were smaller than in most countries and might therefore be expected to be more homogeneous). So, DEFFc = 1 + (4-1) * 0.04 = 1.12, and thus DEFF = 1.22 * 1.12 = 1.37. Given the target minimum effective sample size (neff = 1500) and the design effect (DEFF = 1.37), the net sample size had to be at least nnet = 1500 * 1.37 = 2055. Since about one per cent ineligibles and a response rate of 70 per cent were predicted (from Round 1), the gross sample size needed to be at least ngross = 2055/(0.70 * 0.99) = 2965. In fact, the gross sample size was set at 3,100 to allow for a margin of error.
This should have meant that the effective sample size target would be met so long as the design effect turned out not to exceed 1.43 (if the response rate assumptions held), or so long as the response rate exceeded 67 per cent (if the design effect prediction held).
8 Except for “small” countries with populations of less than 2 million, for which the minimum neff was 800 on grounds of affordability.
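The Greek calculation can be reproduced step by step. The rounding conventions below are ours, chosen to match the figures quoted in the text:

```python
# n_net = n_eff * DEFF and n_gross = n_net / (ER * RR), with n_eff = 1500
# (800 for "small" countries, per the footnote above).

def sample_sizes(deff_p, deff_c, eligibility_rate, response_rate, n_eff=1500):
    deff = round(deff_p * deff_c, 2)
    n_net = round(n_eff * deff)
    n_gross = round(n_net / (eligibility_rate * response_rate))
    return deff, n_net, n_gross

# Greece, Round 2: DEFFp = 1.22, DEFFc = 1.12, ~1% ineligible, 70% response.
assert sample_sizes(1.22, 1.12, 0.99, 0.70) == (1.37, 2055, 2965)

# The gross sample was actually set at 3100, which keeps the 1500 target
# safe as long as the realised design effect stays below about 1.43:
assert round(3100 * 0.99 * 0.70 / 1500, 2) == 1.43
```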
In contrast, it is, of course, much easier to calculate sample size in countries with simple random sampling, since DEFF = 1.0. For example, in Sweden it was decided to select 3000 individuals to form the gross sample. Since the response rate was expected to be 75 per cent and the proportion of ineligibles about 2.3 per cent, the net sample size was nnet = (3000 * 0.977) * 0.75 = 2198. The effective sample size is therefore nnet/DEFF = 2198/1.0 = 2198, clearly higher than the required minimum effective sample size of neff = 1500. To reach this benchmark Sweden would have needed a gross sample size of only ngross = nnet/(ER * RR) = 1500/(0.977 * 0.75) = 2048 individuals.

Organisation of the work

As noted, a sampling panel was set up to produce the sample specification, to develop the design for each nation in co-operation with the National Coordinator, to agree the final sample designs and to review the implementation of the designs. The panel consists of:

• Sabine Häder (Centre for Survey Research and Methodology, Germany) – Chair
• Siegfried Gabler (Centre for Survey Research and Methodology, Germany)
• Seppo Laaksonen (University of Helsinki, Finland)
• Peter Lynn (University of Essex, UK)

During each of the first two rounds of the survey the panel met three times. At the first meeting each panel member was assigned about six nations with which to liaise. Since the teams (panel member, National Coordinator, statisticians in survey institutes) had all co-operated so successfully at Round 1, the same assignments were maintained as far as possible in Round 2. Nations joining the survey for the first time at Round 2 were assigned to one or other panel member so that the allocation of work remained as equal as possible. As a first step, panel members would contact ‘their’ National Coordinators asking for information about the envisaged sample design.
Thereafter, a co-operative process of discussion and decision making between the National Coordinators, the survey organisations and the panellists got under way. As noted, with the knowledge gained from the designs applied in Round 1, we were able to improve the sample designs in several nations at Round 2. At the second panel meeting in each round the panel addressed practical problems occurring in the different countries as well as theoretical
questions affecting the calculation of design effects. (For example, we found that no theory existed for estimation when substantively different designs were used in different domains within a nation, so we had to develop the theory ourselves – Gabler et al., 2006.) Issues such as the possible future inclusion of non-response weights in the ESS were also discussed. In some cases panel members visited one or more of ‘their’ countries for a detailed discussion of problems – whether an expected low response rate, the selection of an appropriate survey organisation, or the development of a completely new design (e.g. Portugal, Italy, Switzerland, France). Once all questions were clarified and resolved, a country’s design was considered “ready for signing off”. A pre-designed form was used for this purpose, containing full details of each design and other details deriving from the discussions between panellists, National Coordinators and survey organisations. Only when all panel members agreed a design was it ‘signed off’. Otherwise, discussion with the National Coordinator would continue until all the perceived problems were resolved. When fieldwork was complete, the panel members guided National Coordinators on how to create the sample design data files which would (after merging with the substantive data) be used to compute design weights and estimate design effects.

Conclusion

The ESS represents a significant step forward in the control of sample design in multi-nation surveys. We believe that the detailed sample design specification we have developed is based on clear principles, is appropriate for cross-national comparisons, and is capable of consistent implementation. We have provided clear guidance on the important role of predicted design effects in sample design and have demonstrated the benefit of a co-operative process of design in the pursuit of equivalence between countries.
Naturally, by no means everything has gone precisely according to plan. Some designs have failed to meet the specification owing to budgetary constraints or poorer than anticipated response rates, which have reduced the ‘ideal’ sample size. But in Round 1 our predictions of design effects proved inaccurate in only two cases. Overall, we can report that the sampling strategy developed for the ESS has produced genuinely comparable samples across countries in all important respects, and that the samples without exception provide reasonable precision of estimation.
References

Cox, L.H. (1987), ‘A constructive procedure for unbiased controlled rounding’, Journal of the American Statistical Association, 82 (398), pp.520–524.
Gabler, S., Häder, S. and Lynn, P. (2006), ‘Design effects for multiple design samples’, Survey Methodology, 32 (1), pp.115–120.
Kish, L. (1965), Survey Sampling, New York: Wiley.
Kish, L. (1995), ‘Methods for design effects’, Journal of Official Statistics, 11 (1), pp.55–77.
Kish, L. (1997), ‘Foreword’, in: T. Lê and V. Verma, DHS Analytical Reports No. 3: An Analysis of Sample Designs and Sampling Errors of the Demographic and Health Surveys, Calverton, MD: Macro International Inc.
Lynn, P. and Gabler, S. (2005), ‘Approximations to b* in the prediction of design effects due to clustering’, Survey Methodology, 31 (1), pp.101–104.
Lynn, P., Gabler, S., Häder, S. and Laaksonen, S. (forthcoming), ‘Methods for achieving equivalence of samples in cross-national surveys’, Journal of Official Statistics, 22 (4).
Park, A. and Jowell, R. (1997), Consistencies and Differences in a Cross-National Survey, London: SCPR.
3
Can questions travel successfully?

Willem E. Saris and Irmtraud Gallhofer∗
Introduction

Among the most important features of the ESS is the fact that so many different checks on quality have been built into its design. As other chapters show, these checks relate not only to areas such as sampling, non-response and fieldwork, but also to the quality of the questionnaires. Without them, we would always be wary of the validity of cross-cultural comparisons drawn from the ESS. We need to know not only about the quality of the English-language source questionnaire, but also about differences in quality across the participating countries, which may arise from the way the same questions come across in different cultural settings. As we will show, the ESS places an unusually strong emphasis on checking data quality during its questionnaire design phases. First, every proposed new question is subjected to prior checks based on predictions of its reliability and validity and an evaluation of the expected quality of the construct it belongs to. New items are also tested in the two-country pilot surveys built into the design phase, which are large enough to sustain analyses of their performance in the field. But the most unusual characteristic of the ESS is its attempt to assess the comparability of its final fielded questions in all countries and languages by means of Multitrait-Multimethod experiments, which allow error structures for several items to be compared and subsequently corrected for measurement error.
∗ Willem Saris is a member of the ESS Central Coordinating Team and Professor at the ESADE Business School, Universitat Ramon Llull, Barcelona; Irmtraud Gallhofer is a member of the ESS Central Coordinating Team and senior researcher at ESADE.
In this chapter we focus on the series of checks built into the ESS that help to ensure the quality of its questionnaires within and across the participating countries.

Seven stages of questionnaire design

Having determined the subject matter of both the largely unchanging core module and the two rotating modules of the questionnaire for each round of the survey, the detailed shape and content of the questionnaire is coordinated and ultimately determined by the Central Coordinating Team, in consultation with the Scientific Advisory Board, the National Coordinators and the two Question Module Design Teams. The content of the core itself was guided by a group of academic specialists in each subject area. The questionnaire at each round goes through the following stages:

• Stage 1
The initial aim is to ensure that the various concepts to be included in the questionnaire – always based on a set of detailed proposals – are represented as precisely as possible by the candidate questions and scales. Since subsequent data users require source material that makes these links transparent, we make available all the initial documents and the subsequent stages of evaluation and re-design that lead to the final questionnaire.

• Stage 2
To achieve the appropriate quality standard, the questions and scales undergo an evaluation using standard quality criteria such as reliability and validity. Where possible, these evaluations are based on prior uses of the question in other surveys. For new questions, however, the evaluations are based on ‘predictions’ of quality that take their respective properties into account. Such predictions are made with the SPQ program.1 Naturally, validity and reliability are not the only criteria that matter.
Attention is also given to other considerations such as the comparability of items over time and place, anticipated item non-response, social desirability and other potential biases, and the avoidance of ambiguity, vagueness and double-barrelled questions.
1 Some explanation of the prediction procedure is given below. For more details refer to Saris et al. (2004a).
• Stage 3
The next step is the initial translation from the source language (English) into one other language for the purposes of two large-scale national pilots. The Translation Panel guides this process, which is designed to ensure optimal equivalence between the languages.

• Stage 4
Next comes the two-nation pilot itself (400 cases per country), which also contains a number of split-run experiments on question wording alternatives. Most of these split-run experiments are built into a self-completion supplement. Some interviews are, on occasion, tape-recorded for subsequent analysis of problems and unsatisfactory interactions.

• Stage 5
The pilot is then analysed in detail to assess both the quality of the questions and the distribution of the substantive answers. Problematic questions – whether because they have displayed weak reliability or validity, deviant distributions or weak scales – are sent back to the drawing board. It is on the basis of these pilot analyses that the final source questionnaire is subsequently developed.

• Stage 6
The source questionnaire then has to be translated into multiple languages. The process is helped by the fact that potentially ambiguous or problematic questions in the English version are annotated to expand on their intended meaning. These annotations – produced in collaboration with the various authors of the questions – attempt to provide greater definition or clarification of the concept behind each question, and are especially useful when the words themselves are unlikely to have direct equivalents in other languages. The translation process, which is designed to ensure that the questions in all languages are optimally functionally equivalent, is discussed in chapter 4.
• Stage 7
Regardless of the effort put into making the questions as functionally equivalent as possible in all languages, it is inevitable that certain questions will be interpreted differently by respondents in certain countries. This may happen either because they use the labels of the response scale in a
country-specific way (systematic errors), or because they are simply so unfamiliar with the concepts addressed by the question that their answers tend to be haphazard (random errors). It was with these problems in mind that we incorporated into the ESS fieldwork a supplementary questionnaire containing questions designed to elicit the extent of random and systematic error in different countries. The data from these supplementary questionnaires will ultimately allow for the correction of measurement error across countries, thus making the findings from different countries more equivalent.

Background to the evaluation of questions

The effect on responses of how questions are worded has been studied in depth by a number of scholars (e.g. Schuman and Presser, 1981; Sudman and Bradburn, 1982; Andrews, 1984; Molenaar, 1986; Alwin and Krosnick, 1991; Költringer, 1993; Scherpenzeel and Saris, 1997). In contrast, little attention has been paid to the difficulties of translating concepts into questions (Blalock, 1990; de Groot and Medendorp, 1986; Hox, 1997). Northrop (1947) was the first to distinguish between concepts-by-intuition and concepts-by-postulation. What he refers to as ‘concepts-by-intuition’ are simple concepts whose meaning is immediately apparent. ‘Concepts-by-postulation’, or ‘constructs’, are, in contrast, concepts that require explicit definition to be properly understood. Concepts-by-intuition thus include judgements, feelings, evaluations, norms and behaviours, where it is relatively clear even on the surface what is meant – such as “people tend to behave in a particular way” or “a certain group is especially likely to have a certain characteristic”. In contrast, concepts-by-postulation require definition, such as “ethnocentrism” or “authoritarianism”. These more complicated concepts can usually be captured only by multiple items in a survey questionnaire.
Attitudes used to be defined as a combination of cognitive, affective and action tendencies (Krech et al., 1962). This conceptualisation was challenged by Fishbein and Ajzen (1975), who defined them instead on the basis of evaluations. Although the two definitions differ, it is interesting that both define attitudes on the basis of concepts-by-intuition. Blalock (1968) had noted the gap between the language of theory and the language of research, and when he returned to the subject two decades later (Blalock, 1990), he observed that the gap had not narrowed. Accepting that such a gap was inevitable to a degree, he also argued that insufficient attention had been given to the development of concepts-by-postulation. Now a further two decades on, the ESS has paid closer attention to the development of concepts-by-postulation. Tests have been carried out to check the structure or
dimensionality of particular concepts. But the first step in developing concepts-by-postulation is a clear view of each concept’s individual components, which are in effect a series of concepts-by-intuition. We therefore start with a focus on the quality of these simpler concepts.

Evaluation of ‘concepts-by-intuition’

Developing a survey item for a concept-by-intuition involves choices, some of which follow directly from the aim of the study and the precise measurement objective, such as whether we want from the respondent an evaluation or a description. But many other choices influence the quality of the survey item, such as the nature or structure of the question, its wording, whether response scales are involved, and the mode of data collection. Several procedures have been developed over the years to evaluate survey items before they are set in stone. The oldest and still most commonly used approach is a pre-test, followed ideally by a de-briefing of the interviewers involved. Another approach, suggested by Belson (1981) and now known as ‘cognitive interviewing’, is to ask people, after they have answered a question in a pre-test, how they interpreted the different concepts in the item. A third approach is the use of other “think aloud” protocols during interviews. A fourth approach to assessing the cognitive difficulty of a question is to refer the matter to an expert panel (Presser and Blair, 1994), or to judge its linguistic or cognitive difficulty on the basis of a specially devised coding scheme or computer program (Forsyth et al., 1992; van der Zouwen, 2000; Graesser et al., Esposito and Rothgeb, 2000a/b). Yet another approach is to present respondents with different formulations of a survey item in a laboratory setting in order to see what effect the wording changes have (Esposito et al., 1991; Esposito and Rothgeb, 1997; Snijkers, 2002). (For an overview of these different cognitive approaches, see Sudman et al., 1996.)
A rather different approach is to monitor the interaction between the interviewer and the respondent through behavioural coding, to see whether or not it follows a standard pattern (Dijkstra and van der Zouwen, 1982). Non-standard interactions may indicate problems related to specific concepts in the items. In all these approaches the researcher attempts to detect response problems, the hypothesis being that certain formulations of the same item will increase or reduce the quality of responses. But the standard criteria for data quality – notably validity, reliability, method effect and item non-response – are not directly evaluated by these methods. As Campbell and Fiske (1959) noted, validity, reliability and method effects can only be evaluated directly by using more than one method to measure the same trait. Their design – the Multitrait-Multimethod or MTMM design – is now widely used in psychology and psychometrics (Wothke, 1996), but has also attracted the
attention of scholars in marketing research (Bagozzi and Yi, 1991). In survey research, the MTMM approach has been elaborated by Andrews (1984) and applied in several languages: English (Andrews, 1984), German (Költringer, 1995) and Dutch (Scherpenzeel and Saris, 1997). This same approach to evaluating questions has been applied in the ESS. But before describing the ESS approach, we must define the quality criteria we employed.

Quality criteria for single survey items

It goes without saying that it is desirable to minimise item non-response in surveys. This is probably the primary criterion for evaluating survey items. Missing values disturb the analysis of answers and may lead to a distortion of the ‘true’ results. A second criterion is bias, defined as a systematic difference between the true scores of the variable of interest and the observed scores (after correction for random measurement error). For validatable factual variables these true scores can be obtained, giving us a benchmark against which to measure and evaluate the (corrected) observed scores. In electoral research, for instance, the published turnout rate may be compared against survey measurements of turnout. It turns out, however, that surveys using standard questions on electoral participation tend to over-estimate turnout. Different formulations may thus be tested to see whether they help to improve the survey estimates. But for attitudinal variables, where the ‘true’ values are, of course, unknown, the only possibility is to study different methods that may in turn generate different distributions. And the problem is that either or neither of any two different distributions might be correct. These issues have attracted a great deal of scholarly attention (see Schuman and Presser, 1981 for a summary); Molenaar (1986) has studied the same issues in non-experimental research. As noted, reliability, validity and method effect are also relevant for any survey instrument.
The way these concepts can be defined in the context of surveys is illustrated in Figure 3.1, which represents a measurement model for two concepts-by-intuition: “satisfaction with the government” on the one hand, and “satisfaction with the economy” on the other. In this model it is assumed that:

– fi is the trait factor i of interest, measured by a direct question;
– yij is the observed variable (variable or trait i measured by method j);
– tij is the “true score” of the response variable yij;
– mj is the method factor, which represents the specific reaction of people to the method and therefore generates a systematic error; and
– eij is the random measurement error term for yij.
Jowell-Chapter-03.qxd
3/9/2007
6:41 PM
Page 59
Can questions travel successfully?
Figure 3.1 Measurement model for two variables measured by the same method
Notes: f1, f2 = variables of interest; vij = validity coefficient for variable i; mj = method factor for both variables; mij = method effect on variable i; tij = true score for yij; rij = reliability coefficient; yij = the observed variable; eij = the random error in variable yij
The rij coefficients represent the standardized effects of the true scores on the observed scores; this effect is smaller when the random errors are larger, so rij is called the reliability coefficient. The vij coefficients represent the standardized effects of the variables one would like to measure on the true scores for those variables; vij is called the validity coefficient. The mij coefficients represent the standardized effects of the method factor on the true scores; mij is called the method effect, and the larger it is, the smaller the validity coefficient. It can be shown that in this model mij² = 1 – vij², so the method effect is equal to the invalidity due to the method used.
Reliability is defined as the strength of the relationship between the observed response (yij) and the true score (tij), which is rij². Validity is defined as the strength of the relationship between the variable of interest (fi) and the true score (tij), which is vij². The systematic method effect is the strength of the relationship between the method factor (mj) and the true score (tij), which is mij². The total quality of a measure is defined as the strength of the relationship between the observed variable and the variable of interest, which is (rijvij)². The effect of the method on the correlations is equal to r1jm1jm2jr2j.

So, by looking at the effect of the characteristics of the measurement model on the correlations between observed variables, it can be seen that both the definitions and the criteria are appropriate. Using elementary path analysis, it can be shown that the correlation between the observed variables ρ(y1j, y2j) is equal to the correlation due to the variables we want to measure, f1 and f2, reduced by measurement error, plus the correlation due to the method effects, as in the formula below:

ρ(y1j, y2j) = r1j v1j ρ(f1, f2) v2j r2j + r1j m1j m2j r2j    (1)
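The quantities just defined can be sketched in a few lines of Python. This is an illustrative sketch, not ESS or SQP code, and the function names are ours; note that the SQP tables later in this chapter report the squared quantities (r², v², m² and the total quality r²v²).

```python
# Illustrative sketch of the MTMM measurement-model quantities (Figure 3.1).
# Not ESS/SQP code; function names are our own.

def method_effect_sq(validity_sq):
    """Systematic method effect: m_ij^2 = 1 - v_ij^2."""
    return 1.0 - validity_sq

def total_quality(reliability_sq, validity_sq):
    """Total quality (r_ij * v_ij)^2 = r_ij^2 * v_ij^2."""
    return reliability_sq * validity_sq

def observed_correlation(true_corr, r1, v1, m1, r2, v2, m2):
    """Equation (1), using the unsquared coefficients:
    rho(y1j, y2j) = r1*v1*rho(f1, f2)*v2*r2 + r1*m1*m2*r2."""
    return r1 * v1 * true_corr * v2 * r2 + r1 * m1 * m2 * r2
```

For example, the first item of Table 3.1 below has r² = .77 and v² = .82, which gives a method effect of 1 – .82 = .18 and a total quality of .77 × .82 ≈ .63, exactly as tabulated.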
Note that rij and vij, which are always smaller than 1, will reduce the correlation (see first term), while the method effects – if they are not zero – may increase the correlation (see second term). So if one knows the reliability coefficients, the validity coefficients and the method effects, one can estimate the correlation between the variables of interest corrected for measurement error.

The Multitrait-Multimethod design

The problem is that the above coefficients cannot be estimated if only one measurement of each trait is available, since in that case there would be only one observed correlation available to estimate seven free parameters. That is why the MTMM design with three traits, each measured with three different methods, was suggested by Campbell and Fiske (1959). Several such experiments have been included in the ESS.

Naturally we crafted questions for the ESS interview which we considered optimal for each of the variables. But in the supplementary questionnaire, which is completed by respondents after the main questionnaire, we experimented with alternative versions of some of the same questions. For these
experiments the sample was split up randomly into two or six groups, allowing each version of each question to be tested on at least two random groups. Combining the data from these groups, three questions about three traits were asked in different ways so that any one respondent received only one repetition of a question for the same trait. This design has been called the Split Ballot MTMM (SB-MTMM) design (Saris et al., 2004b), and it is an alternative to the classical MTMM design used by Andrews (1984). The important difference between SB-MTMM and the classic MTMM is that in the latter each respondent receives two repetitions of the same question, whereas in the former each respondent receives only one. Memory effects are thus smaller and the response burden lower in SB-MTMM. The disadvantage of SB-MTMM, however, is that the data matrix is incomplete, but Saris et al. (2004b) have shown that all parameters can still be estimated using multiple group analysis (Jöreskog, 1971).

Predicting the quality of questions

It will be clear that such experiments cannot realistically be contemplated for all questions in a questionnaire as long as the ESS's. In any case, estimates of data quality are affected by other factors, such as the item's position in the questionnaire, its distance from another MTMM measure, the length of its text, and so on. To take these factors into account and to predict the quality of questions which have not been studied explicitly, one needs another approach, known as the meta-analysis of MTMM experiments (Andrews, 1984; Scherpenzeel and Saris, 1997). If the MTMM experiments are chosen in such a way as to cover the most important choices involved in the design of the questions, a meta-analysis of the results of these experiments can provide an estimate of the effect of those choices on the reliability, validity and method effects in the ESS as a whole.
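The two-group SB-MTMM assignment described above can be made concrete with a short sketch (our own illustration; the group labels and method numbers are hypothetical, not ESS codes). Every respondent answers the main-questionnaire form (method 1) plus exactly one alternative form, so the pair of alternative forms is never observed for the same respondent – precisely the incompleteness in the data matrix that multiple group analysis must bridge.

```python
from itertools import combinations

# Hypothetical two-group split-ballot MTMM (SB-MTMM) assignment: method 1 is
# the main-questionnaire form answered by everyone; each random group answers
# one alternative form in the supplementary questionnaire.
GROUPS = {
    "group_1": (1, 2),  # main form + first alternative
    "group_2": (1, 3),  # main form + second alternative
}

def observed_pairs(groups):
    """Method pairs whose covariance is directly observed in some group."""
    pairs = set()
    for forms in groups.values():
        for pair in combinations(sorted(forms), 2):
            pairs.add(pair)
    return pairs

def missing_pairs(groups, n_methods=3):
    """Method pairs never observed together: the holes in the data matrix."""
    all_pairs = set(combinations(range(1, n_methods + 1), 2))
    return all_pairs - observed_pairs(groups)
```

Here `observed_pairs(GROUPS)` yields the pairs (1, 2) and (1, 3), while (2, 3) is missing – yet no respondent had to answer any question more than twice.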
And for this purpose of evaluating the overall quality of survey instruments, the program SQP was duly developed.2 The predictions are founded on the properties of more than 1000 survey questions, in addition to the questions in the ESS itself.

Evaluation of 'concepts-by-postulation'

We provide below a number of examples of checks we conducted on the ESS. For a complete description of the evaluation of all instruments, see the ESS website (www.europeansocialsurvey.org.uk). Each of these examples illustrates a different aspect of the checks performed.

2. In this case a Windows version of SQP developed by Oberski, Kuipers and Saris (2004) has been used. The latest version of the program can be obtained from the author by writing to [email protected]
Political efficacy

The concept of political efficacy is a longstanding component of surveys in political science. Two forms of efficacy are often distinguished – beliefs about the responsiveness of the system on the one hand (external efficacy), and beliefs about one's own competence on the other (internal efficacy). Only the second component was included in the ESS core questionnaire. We follow below the stages of design and testing that led us to the final questions. The original formulation of the three proposed political efficacy questions was as follows:

How much do you agree or disagree with each of the following statements?3

1. Voting is the only way that people like me can have any say about how the government runs things.
2. Sometimes politics and government seem so complicated that a person like me cannot really understand what is going on.
3. It is difficult to see the important differences between the parties.

The initial SQP analysis (Table 3.1) suggested that the first two questions were acceptable, but the third one less so, because it is a complex assertion.4

Table 3.1 1st SQP evaluation of initial three items on political efficacy

Item    Reliability    Validity    Method effect    Total quality
1       .77            .82         .18              .63
2       .76            .83         .17              .63
3       .62            .76         .24              .47
3. The five-point agree/disagree scale ranged through 'disagree strongly', 'disagree', 'neither agree nor disagree', 'agree', 'agree strongly'.
4. Here we use specifications of concepts suggested by Saris and Gallhofer (forthcoming).

But the CCT had its doubts about the concept-by-postulation approach in this instance because of the different premises on which the three items were based. While the first item presents a relationship, the second represents an
evaluative belief and the third a complex judgement.5 So the items were seeking to measure different concepts. And these doubts were confirmed by prior studies of the same items. Indeed, in Netherlands election studies, the first two items did not even end up in the same scale because of the low correlation between them.

The specialist response to our concerns was that there had indeed been debate about the quality of these questions, and that an elaborate study by Vetter (1997) had shown the old questions to be flawed by unclear factor structures. Experiments had been conducted on other items, shown below, which had obtained a clearer factor structure:

How much do you agree or disagree with each of the following statements?

1. I think I can take an active role in a group that is focused on political issues.
2. I understand and judge important political questions very well.
3. Sometimes politics and government seem so complicated that a person like me cannot really understand what is going on.

These items were indeed more homogeneous, all being perceptions of the respondent's own subjective competence with respect to politics. But when we checked the quality of these questions using SQP, the results (shown in Table 3.2) were not at all encouraging.

Table 3.2 2nd evaluation by SQP of alternative three items on political efficacy

Item    Reliability    Validity    Method effect    Total quality
1       .64            .72         .28              .46
2       .63            .91         .09              .57
3       .65            .87         .13              .57
5. Here we also use specifications of concepts suggested by Saris and Gallhofer (forthcoming).

Taking into account the suggested lower quality of items which were supposed to be more homogeneous than the first group, the CCT decided to investigate further using an alternative measurement procedure. Since Saris et al. (2003) had suggested that better results can be obtained from questions with trait-specific response categories than from batteries of
agree/disagree items, a test was conducted in the pilot study using the following alternative form of Vetter's three items:

How often do politics and government seem so complicated that you can't really understand what is going on?6

Do you think that you could take an active role in a group that is focused on political issues?7

How good are you at understanding and judging political questions?8

Table 3.3 3rd evaluation by SQP of further variation in items on political efficacy

Reliabilities
                  Complexity        Active role       Understand
Method            NL      GB        NL      GB        NL      GB
A/D 5 cat         0.65    0.83      0.66    0.71      0.69    0.78
TS 5 cat          0.88    0.70      0.94    0.86      0.86    0.84
A/D 4/5 cat9      0.78    0.73      0.87    0.82      0.82    0.80
The important difference between the two sets of questions is that the earlier one used the same introduction and response categories for all three items (a 'battery'), while the one above uses different response categories for each question (a 'trait-specific scale') (Saris et al., 2003). The contrasting results are shown in Table 3.3, where A/D means an agree/disagree scale was used and TS means a trait-specific scale was used.

This table shows that in the Netherlands the trait-specific format had a much higher reliability than the two agree/disagree formats. In Britain the size of this effect is much less clear, but it holds true for two of the three items. Given the low reliabilities of the first agree/disagree measure, it is no wonder that the correlations between these items are normally so low. Based on the results of this pilot study, we decided to use the trait-specific scales in all countries. The correlations for these items are now much higher than they were.

6. The response categories were 'never', 'seldom', 'occasionally', 'regularly', 'frequently' and 'don't know'.
7. The response categories were 'definitely not', 'probably not', 'not sure either way', 'probably', 'definitely' and 'don't know'.
8. The response categories were 'very bad', 'bad', 'neither good nor bad', 'good', 'very good' and 'don't know'.
9. MTMM requires three methods to be used for each trait. The third one was similar to the first, but for some items a four-point scale was used in error.
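The pattern in Table 3.3 can be checked quickly by averaging the reliabilities per format. This is a simple illustration using the table's own figures; the variable names are ours.

```python
# Reliabilities from Table 3.3, grouped by question format.
# For each format: (NL values, GB values), each covering the three items
# (complexity, active role, understand).
RELIABILITIES = {
    "A/D 5 cat":   ((0.65, 0.66, 0.69), (0.83, 0.71, 0.78)),
    "TS 5 cat":    ((0.88, 0.94, 0.86), (0.70, 0.86, 0.84)),
    "A/D 4/5 cat": ((0.78, 0.87, 0.82), (0.73, 0.82, 0.80)),
}

def mean(values):
    return sum(values) / len(values)

nl_means = {fmt: mean(nl) for fmt, (nl, _) in RELIABILITIES.items()}
gb_means = {fmt: mean(gb) for fmt, (_, gb) in RELIABILITIES.items()}
# In the Netherlands the trait-specific (TS) format is clearly more reliable
# on average; in Britain the difference between formats is much smaller.
```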
The Human Values Scale

A second and in principle more difficult evaluation was prompted before Round 1 by the proposal to include the highly praised Human Values Scale in the ESS questionnaire (Schwartz, 1997; see also chapter 8). The scale differentiates between people's underlying value orientations, which Schwartz describes as "affect-laden beliefs that refer to a person's desirable goals and guide their selection or evaluation of actions, policies, people and events".

The problem we faced was that each item of the Scale contained two different ways of expressing what was essentially the same motivation or belief. All the questions refer to an imaginary person who feels or acts in different ways. But to get across both the motivational and the value components of the imaginary person, two statements are invariably employed – the first about what is important to that person and the second more about the person's motivations. The problem for questionnaire design, however, is that it is generally advisable to avoid double-barrelled questions, for the simple reason that some respondents will wish to agree with one half of the question and to disagree with the other half. So the CCT's first reaction was to query whether one statement for each item – the clear value statement – might not suffice.

The following example of a single item in the scale helps to explain the perceived problem. The 21-item scale was preceded by the following introduction:

Here we briefly describe some people. Please read each description and think about how much each person is or is not like you. Put an X in the box to the right that shows how much the person in the description is like you.10 How much like you is this person?

After this general introduction the different statements were read to the respondents. The one for the value of an "exciting life" was formulated as follows:

He likes surprises. It is important to him to have an exciting life.

10. The response categories were: 'very much like me', 'like me', 'somewhat like me', 'a little like me', 'not like me', 'not like me at all'.

The first part of this item relates to a feeling on the part of the imaginary person: "he likes surprises". The second part is a value: "it is important to him to have an exciting life". As suggested by Saris and Gallhofer (2005), values are ideally measured by statements that take the form of the second sentence, which express the
importance to the individual of "an exciting life". Any difference between the actual and observed variable would probably be attributable to measurement error alone, as indicated in Figure 3.2.

Figure 3.2 Importance and value position

The other part of the item ("he likes surprises") relates to a feeling on the part of the imaginary person. But is such a feeling the same as a value or is it, more probably, a consequence of the value? There may indeed be other variables that influence the feeling, such as 'risk aversion'. That is, two people with the same view of the importance of an exciting life might differ in their attitude to the risks inherent in such a life. This could be modelled as in Figure 3.3.

So although the two measures might be the same, apart from measurement error, they might turn out to be different as a result of the intervention of a third variable – in this case risk aversion or some other similar influence. As this was an important issue to clarify in advance, we decided to conduct a pilot experiment to test whether the direct value measures received the same answers as the feeling questions. Three of the Human Values Scale items were decomposed into an importance assertion and a feeling assertion and measured in the supplementary questionnaire. We present here the analysis from the Dutch pilot only, but the British pilot produced the same results. First we present the results of a standard MTMM analysis in Table 3.4.
Figure 3.3 Importance, value and risk

Table 3.4 MTMM experiment on Human Values Scale (Netherlands pilot, Round 1)

                   item 1          item 2          item 3
                   1       2       1       2       1       2
Reliability of
  Importance       0.87    0.84    0.94    0.87    0.96    0.86
  Feeling          0.89    0.93    0.89    0.91    0.97    0.88
  Complete item    0.82    0.91    0.76    0.93    0.81    0.97
Validity of
  Importance       0.99    0.92    0.99    0.93    0.99    0.93
  Feeling          0.99    0.97    0.99    0.97    0.99    0.97
  Complete item    0.99    0.98    0.99    0.98    0.99    0.99
There is little indication from this analysis that either of the measures is better than, or systematically different from, the other. So as Table 3.5 shows, we then checked the correlations between the three items after correction for measurement error. Again, even without a formal test, it is clear that there is no substantive or relevant difference between these correlations. Although we now had very
Table 3.5 Correlations between the three items and their components

Correlations of     Items only      Items only      Items
the variables       importance      feelings        combinations
1 with 2            .74             .70             .71
2 with 3            .55             .52             .49
1 with 3            .50             .50             .49
strong evidence of the equality of these measures, it could still be argued that the variables were different even though the intercorrelations between them were the same. Given the absence of any relevant method effect, we could directly test the equality of the measures by testing whether the correlations between the importance and the feeling assertions for each separate item were in fact equal (or almost equal) to 1. We did this using the congeneric test model of Jöreskog (1971), with the results shown in Table 3.6.

Table 3.6 Test of the equality of the importance and feeling variables

                        corr = 1            corr = free
Item        n           chi2      df        chi2      df        corr
1           200         26.0      2         4.0       1         0.85
2           200         16.6      2         0.1       1         0.91
3           200         15.6      2         9.4       1         0.95
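The comparison underlying Table 3.6 is a chi-square difference (likelihood-ratio) test between nested models: the constrained model fixes the correlation at 1, the free model estimates it. The arithmetic can be sketched as follows; this is our own illustration using only the figures in the table, and it assumes the models are nested with a 1-df difference.

```python
import math

def chi2_sf_1df(x):
    """Survival function of a chi-square variable with 1 df:
    P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_difference_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Chi-square difference test for nested models
    (here: corr fixed at 1 versus corr free)."""
    diff = chi2_constrained - chi2_free
    ddf = df_constrained - df_free
    assert ddf == 1, "this sketch only handles a 1-df difference"
    return diff, chi2_sf_1df(diff)

# Item 1 from Table 3.6: constrained chi2 = 26 (df 2), free chi2 = 4 (df 1).
diff1, p1 = chi2_difference_test(26.0, 2, 4.0, 1)
# Item 3: constrained chi2 = 15.6 (df 2), free chi2 = 9.4 (df 1).
diff3, p3 = chi2_difference_test(15.6, 2, 9.4, 1)
```

The constraint is rejected statistically for each item, which matches the text's remark that the requirement of a correlation of 1 is "not formally met" even though the estimated correlations (.85–.95) were high enough for practical purposes.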
Although the requirement that the correlation should be 1 is not formally met, its estimated value after correcting for measurement error is between .85 and .95, which was certainly high enough for our purposes. On the basis of these results the CCT concluded that it did not matter that two statements were combined into single items. So the Human Values Scale remained intact and is proving highly productive (see chapter 8).

An evaluation of cross-cultural comparability

In each round of the ESS we conduct six MTMM experiments in all countries, specifically to detect whether the quality of our measurement instruments is the same in different countries. These MTMM experiments also help us to evaluate the effects of different decisions during the questionnaire design phase on the quality of the instruments. When a sufficient number of these MTMM experiments have been conducted,
we might well be able to predict the quality of all questions before the data are collected. This point has not been reached, but the information is building up. Here we discuss only one experiment in detail, while giving an overview of the results of the six experiments conducted in Round 2.

In the ESS questionnaire we generally employ answer categories with fixed reference points – that is, labels on an underlying scale, such as "extremely satisfied" and "extremely dissatisfied" as end points. If a less fixed reference point, such as "very dissatisfied", is used as the end of the scale, some respondents might regard it as an end point and others might see it as an intermediate position – a difference of perception that may cause differences in responses which have little to do with differences in substantive opinions. So it is generally believed that fixed reference points have advantages, and we conducted an experiment during the main ESS fieldwork to test this assumption in all participating countries. At the same time we were able to test whether, on a 0–10 numeric satisfaction scale, a third fixed reference point – a labelled mid-point ("neither dissatisfied nor satisfied") – would work better than just two.

The topic for the experiments was a group of well-worn questions about people's satisfaction levels with different aspects of society – the state of the economy, the way the government is doing its job, and the way democracy is working. The questions themselves are shown in the appendix to this chapter. All three statements were subjected to three separate experimental treatments – one with fixed end reference points ("extremely dissatisfied" and "extremely satisfied"), one without fixed end reference points ("very dissatisfied" and "very satisfied"), and a third with three fixed reference points ("extremely dissatisfied" and "extremely satisfied" plus a fixed mid-point, "neither dissatisfied nor satisfied").
The mean quality (total) across the three forms shows that the standard form with two fixed reference points performs quite a bit better (.77) than the form without fixed reference points (.70) and the one with three reference points (.70). But how much did the data quality vary across the 17 countries we analysed? Table 3.7 shows the findings for each of the three questions in the main questionnaire.

We see first of all that the question asking about satisfaction with the economy produces considerably lower quality data than do the other two. This is not the effect of just a few countries; in all countries the quality of the first question is lower than that of the other two questions. And, as Table 3.7 shows, the differences are large, always more than .2, which would in turn produce appreciable differences in correlations with other variables.

These findings show that it is necessary to correct for data quality before comparing correlations across countries. Such corrections can be made using
Table 3.7 The quality of the three questions in different countries

Country    Economy    Government    Democracy
Aus        .7921      .8836         .8649
Bel        .7056      .8464         .8649
Cze        .5997      .6555         .6561
Den        .6889      .9025         .8100
Est        .7921      .9025         .8649
Fin        .6724      .8281         .8281
Ger        .5219      .7792         .8129
Gre        .7632      .7964         .8138
Lux        .7225      .7921         .9801
Nor        .7396      .9801         .7569
Pol        .7569      .9025         .8836
Por        .8100      .8464         .8281
Slo        .5858      .7162         .6416
Spain      .5675      .6688         .6688
Swe        .6235      .7474         .6521
Swi        .7396      .8100         .9409
UK         .7744      .8836         .8100
Total      .6974      .8201         .8046
the information contained in Table 3.7. For instance, take the observed correlation between the 'economy' item and the 'government' item (r12). In Estonia the correlation between these two variables is .659 and in Spain .487 – a large difference. Before we know whether this is substantively relevant, however, we must correct for measurement error, especially because – as we have seen – the quality of the measure is much higher in Estonia than in Spain. In Estonia the quality estimates for these two items are .792 and .903 respectively, while in Spain they are .567 and .669. The correction for measurement error can be made by dividing the observed correlation by the product of the quality coefficients (the square roots of the quality estimates) for the two variables,11 giving the correlation corrected for measurement error (ρ12):

ρ12 = r12 / (q1 q2)    (2)
Using this formula we obtain a correlation corrected for measurement error of .78 for Estonia and .79 for Spain. These two corrected correlations are both larger than the observed ones, and the correction reduces the difference in correlations between the two countries from nearly .2 to .01, underlining the importance of these corrections.

11. Note that the quality estimates in the tables are q². The quality coefficients (q) are therefore obtained by taking the square roots of the quality estimates.
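Equation (2) is straightforward to apply directly. The sketch below (our own illustration; the function name is ours) reproduces the Estonia and Spain corrections from the quality estimates in Table 3.7.

```python
import math

def corrected_correlation(r_obs, quality_sq_1, quality_sq_2):
    """Equation (2): rho_12 = r_12 / (q_1 * q_2), where the quality
    coefficients q are the square roots of the quality estimates q^2."""
    return r_obs / (math.sqrt(quality_sq_1) * math.sqrt(quality_sq_2))

# Estonia: observed r = .659; quality estimates .7921 and .9025 (Table 3.7).
estonia = corrected_correlation(0.659, 0.7921, 0.9025)   # about .78
# Spain: observed r = .487; quality estimates .5675 and .6688.
spain = corrected_correlation(0.487, 0.5675, 0.6688)     # about .79
```

The observed correlations differ by .17, while the corrected correlations differ by only about .01, which is the point made in the text.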
Just why the quality varies so much for different countries remains an open question. It may well have something to do with the translations or the formulation of the questions, which we know to be flawed. But it may also have something to do with the number of interviews per interviewer. We are still investigating this phenomenon.

Conclusion

We have described the attention we have given to evaluating the quality of single survey items, as well as the link between these items and the 'concepts-by-postulation' that were originally proposed. In several instances this led to changes in the items. Whenever the CCT was uncertain about the quality of the new formulations, they were tested in the pilot study against alternative formulations. It was on this basis that the source questionnaire was formulated.

But the source questionnaire was then translated from English into many different languages. Despite all the efforts we made to achieve functional equivalence for all questions, it is, of course, still likely that certain errors will be of different sizes in different languages. By including MTMM experiments in our procedures, we were able to estimate these different error variances, as well as the reliability and validity of some questions in the different languages.

Our analyses show that there are indeed differences in error structures between certain languages, and thus in the reliability and validity of certain questions by country. But when corrections for measurement error are made, this sometimes turns out to reduce the differences we had first found. The corrections can of course also have the opposite effect, turning similar observed correlations into rather larger differences. Either way, our analyses show that conclusions drawn from the observed and the corrected correlations are sometimes appreciably different. The fact is that it is risky to derive statistics from observed correlations in advance of corrections for measurement error.
The ESS is unusual in its emphasis on estimating the reliability and validity of questions across countries, but – since only six experiments per round are possible – this has not yet been done for all questions. More will be done as we go along to facilitate the correction of measurement error. In time we hope to have enough readings to conduct a meta-analysis which includes data from all countries, so that a predictive program such as SQP can be developed for all languages represented in the ESS. The predictive program could then be used to determine the size of the measurement errors in different countries, and we could provide tools to correct for them. Data from different countries could then be compared without the obstacle of differential measurement error.
References

Alwin, D.F. and Krosnick, J.A. (1991), 'The reliability of survey attitude measurement: the influence of question and respondent attributes', Sociological Methods and Research, 20, pp.139–181.
Andrews, F.M. (1984), 'Construct validity and error components of survey measures: a structural modelling approach', Public Opinion Quarterly, 48, pp.409–422.
Bagozzi, R.P. and Yi, Y. (1991), 'Multitrait-multimethod matrices in consumer research', Journal of Consumer Research, 17, pp.426–439.
Belson, W. (1981), The design and understanding of survey questions, London: Gower.
Blalock, H.M. Jr. (1968), 'The measurement problem: a gap between languages of theory and research' in: H.M. Blalock and A.B. Blalock (eds), Methodology in the Social Sciences, London: Sage.
Blalock, H.M. Jr. (1990), 'Auxiliary measurement theories revisited' in: J.J. Hox and J. de Jong-Gierveld (eds), Operationalisation and research strategy, Amsterdam: Swets & Zeitlinger, pp.33–49.
Campbell, D.T. and Fiske, D.W. (1959), 'Convergent and discriminant validation by the multitrait-multimethod matrix', Psychological Bulletin, 56, pp.833–853.
Dijkstra, W. and van der Zouwen, J. (1982), Response behaviour in the survey-interview, London: Academic Press.
Esposito, J., Campanelli, P.C., Rothgeb, J. and Polivka, A.E. (1991), 'Determining which questions are best: methodologies for evaluating survey questions' in: Proceedings of the Survey Research Methods Section of the American Statistical Association (1991), pp.46–55.
Esposito, J.P. and Rothgeb, J.M. (1997), 'Evaluating survey data: making the transition from pretesting to quality assessment' in: L. Lyberg, P. Biemer, M. Collins, E. de Leeuw, C. Dippo, N. Schwarz and D. Trewin (eds), Survey Measurement and Process Quality, New York: Wiley, pp.541–571.
Fishbein, M. and Ajzen, I. (1975), Belief, Attitude, Intention and Behavior: An Introduction to Theory and Research, Reading, MA: Addison-Wesley.
Forsyth, B.H., Lessler, J.T. and Hubbard, M.L. (1992), 'Cognitive evaluation of the questionnaire' in: C.F. Tanur and R. Tourangeau (eds), Cognition and Survey Research, New York: Wiley, pp.183–198.
Graesser, A.C., Wiemer-Hastings, K., Kreuz, R. and Wiemer-Hastings, P. (2000a), 'QUAID: a questionnaire evaluation aid for survey methodologists. Behavior Research Methods, Instruments, and Computers' in: Proceedings of the Section on Survey Research Methods of the American Statistical Association, pp.459–464.
Graesser, A.C., Wiemer-Hastings, K., Wiemer-Hastings, P. and Kreuz, R. (2000b), 'The gold standard of question quality on surveys: experts, computer tools, versus statistical indices' in: Proceedings of the Section on Survey Research Methods of the American Statistical Association, pp.459–464.
Groot, A.D. de and Medendorp, F.L. (1986), Term, begrip, theorie: inleiding tot signifische begripsanalyse, Meppel: Boom.
Hox, J.J. (1997), 'From theoretical concept to survey questions' in: L. Lyberg, P. Biemer, M. Collins, E. de Leeuw, C. Dippo, N. Schwarz and D. Trewin (eds), Survey Measurement and Process Quality, New York: Wiley, pp.47–70.
Jöreskog, K.G. (1971), 'Simultaneous factor analysis in several populations', Psychometrika, 34, pp.409–426.
Költringer, R. (1993), Gueltigkeit von Umfragedaten, Wien: Böhlau.
Költringer, R. (1995), 'Measurement quality in Austrian personal interview surveys' in: W.E. Saris and A. Münnich (eds), The Multitrait-Multimethod Approach to Evaluate Measurement Instruments, Budapest: Eötvös University Press, pp.207–225.
Krech, D., Crutchfield, R. and Ballachey, E. (1962), Individual in Society, New York: McGraw-Hill.
Molenaar, N.J. (1986), Formuleringseffecten in survey-interviews, Amsterdam: VU-uitgeverij.
Northrop, F.S.C. (1947), The Logic of the Sciences and the Humanities, New York: World Publishing Company.
Presser, S. and Blair, J. (1994), 'Survey pretesting: do different methods produce different results?' in: P.V. Marsden (ed), Sociological Methodology, Oxford: Basil Blackwell, pp.73–104.
Saris, W.E. and Gallhofer, I.N. (2005), A scientific method for questionnaire design: SQP, Amsterdam: SRF.
Saris, W.E. and Gallhofer, I.N. (forthcoming), Design, evaluation and analysis of questionnaires for survey research, Hoboken: Wiley.
Saris, W.E., Krosnick, J.A. and Shaeffer, E.M. (unpublished, 2003), 'Comparing the quality of agree/disagree questions and balanced forced choice questions via an MTMM experiment', paper presented at the Midwestern Psychological Association Annual Meeting, Chicago, Illinois.
Saris, W.E., Satorra, A. and Coenders, G. (2004b), 'A new approach for evaluating quality of measurement instruments', Sociological Methodology, pp.311–347.
Saris, W.E., van der Veld, W. and Gallhofer, I.N. (2004a), 'Development and improvement of questionnaires using predictions of reliability and validity' in: S. Presser, M.P. Couper, J.T. Lessler, E. Martin, J. Martin, J.M. Rothgeb and E. Singer (eds), Methods for Testing and Evaluating Survey Questionnaires, Hoboken: Wiley, pp.275–299.
Scherpenzeel, A. and Saris, W.E. (1997), 'The validity and reliability of survey questions: a meta analysis of MTMM studies', Sociological Methods and Research, 25, pp.341–383.
Schuman, H. and Presser, S. (1981), Questions and answers in attitude surveys: experiments on question form, wording and context, New York: Academic Press.
Schwartz, S.H. (1997), 'Values and culture' in: D. Munro, S. Carr and J. Schumaker (eds), Motivation and Culture, New York: Routledge, pp.69–84.
Snijkers, G. (2002), Cognitive laboratory experiences: on pretesting, computerized questionnaires and data quality, PhD thesis, University of Utrecht.
Sudman, S. and Bradburn, N. (1982), Asking Questions: A Practical Guide to Questionnaire Design, San Francisco: Jossey-Bass.
Sudman, S., Bradburn, N. and Schwarz, N. (1996), Thinking About Answers: The Application of Cognitive Processes to Survey Methodology, San Francisco: Jossey-Bass.
van der Veld, W.M., Saris, W.E. and Gallhofer, I. (2000), 'SQP: A program for prediction of the quality of survey questions', Paper presented at the ISA methodology conference, Köln, September 2000.
van der Zouwen, J. (2000), 'An assessment of the difficulty of questions used in the ISSP questionnaires, the clarity of their wording and the comparability of the responses', ZA-Information, 45, pp.96–114.
Vetter, A. (1997), 'Political Efficacy: Alte und neue Meßmodelle im Vergleich', Kölner Zeitschrift für Soziologie und Sozialpsychologie, 49, pp.53–73.
Wothke, W. (1996), 'Models for multitrait-multimethod matrix analysis' in: G.C. Marcoulides and R.E. Schumacker (eds), Advanced Structural Equation Modelling: Issues and Techniques, Mahwah, NJ: Lawrence Erlbaum, pp.7–56.
Appendix: The ESS Satisfaction questions from Round 2

Form as in the main questionnaire

B25 STILL CARD 13: On the whole how satisfied are you with the present state of the economy in [country]? Still use this card.

    Extremely dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Extremely satisfied    (Don't know)  88
B26 STILL CARD 13: Now thinking about the [country] government, how satisfied are you with the way it is doing its job? Still use this card.

    Extremely dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Extremely satisfied    (Don't know)  88
B27 STILL CARD 13: And on the whole, how satisfied are you with the way democracy works in [country]? Still use this card.

    Extremely dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Extremely satisfied    (Don't know)  88
Second Form

B25 STILL CARD 13: On the whole how satisfied are you with the present state of the economy in [country]? Still use this card.

    Very dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Very satisfied
B26 STILL CARD 13: Now thinking about the [country] government, how satisfied are you with the way it is doing its job? Still use this card.

    Very dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Very satisfied
B27 STILL CARD 13: And on the whole, how satisfied are you with the way democracy works in [country]? Still use this card.

    Very dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Very satisfied
Third Form

B25 STILL CARD 13: On the whole how satisfied are you with the present state of the economy in [country]? Still use this card.

    Extremely dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Extremely satisfied
    (midpoint 05: neither satisfied nor dissatisfied)
B26 STILL CARD 13: Now thinking about the [country] government, how satisfied are you with the way it is doing its job? Still use this card.

    Extremely dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Extremely satisfied
    (midpoint 05: neither satisfied nor dissatisfied)
B27 STILL CARD 13: And on the whole, how satisfied are you with the way democracy works in [country]? Still use this card.

    Extremely dissatisfied  00  01  02  03  04  05  06  07  08  09  10  Extremely satisfied
    (midpoint 05: neither satisfied nor dissatisfied)
4 Improving the comparability of translations

Janet A. Harkness∗
Introduction

When a survey is conducted in multiple languages, the quality of questionnaire translations is a key factor in determining the comparability of the data collected. Conversely, poor translations of survey instruments have been identified as frequent and potentially major sources of survey measurement error. As the ESS questionnaires consist of more than 250 substantive and background questions, around one half of which are new at each round, it is hardly surprising that considerable time, resources and methodological effort have been devoted to developing rigorous procedures for translation and translation assessment. This chapter describes the general approach and individual procedures adopted by the ESS to enhance the quality of translations. In line with recent developments in survey translation practice, the ESS has replaced more traditional survey approaches involving back translation with the team-based approach described and assessed here.
∗ Janet Harkness is a Senior Research Scientist at ZUMA, Mannheim, Germany and Director of the Survey Research and Methodology Program at the University of Nebraska, USA, where she holds the Donald and Shirley Clifton Chair in Survey Science.
Source and target languages

Following terminology from translation science, we distinguish here between 'source' languages and 'target' languages and thus between source language questionnaires and target language questionnaires. While a source language is the language from which a translation is made, a target language is the language into which translation is made. In the case of the ESS, the source language is (British) English and all translations into other languages are required to be made from the original British source documents.

Multi-national, multi-lingual surveys can follow either of two broad strategies for designing instruments. The first is to try to collect equivalent data by an 'ask the same questions' (ASQ) approach to instrument design. The second is to try to collect comparable data by an 'ask different questions' (ADQ) approach. In ASQ approaches, the various language versions of the questionnaire are produced on the basis of translation of a source questionnaire. In ADQ approaches, construct or conceptual overlap of the questions used in each context is the basis of comparability, not translation.

The ESS, in line with the majority of cross-national surveys, has essentially adopted what may be referred to as a sequential ASQ approach – that is, we finalise the source questionnaire before embarking on other language versions. Two further ASQ models – the simultaneous approach and the parallel approach – cannot be discussed here but see Harkness (forthcoming) for details.

Sequential models have certain advantages which have undoubtedly contributed to their popularity in cross-national research. For instance, a sequential ASQ approach is relatively economical and straightforward to organise. More important still, it permits an unlimited number of target language versions to be produced and allows projects to aim at replicating existing questions.
In contrast to ADQ approaches, it offers analysts the chance to make item-for-item comparisons across data sets (cf. van de Vijver, 2003). At the same time, an ASQ sequential approach focuses less on cross-cultural input at the instrument design stage than other models, as discussed in Harkness (forthcoming). Sequential ASQ models operate on the underlying assumption that functionally equivalent questions can be produced for different cultures and languages by translating the semantic content of the source questions. Their success depends both on the suitability of source question content and formulation and on the quality of the translations made. In saying this, it is important to remember that contextual considerations contribute to determining what respondents perceive questions to mean (see, for example, Schober and Conrad, 2002; Harkness, 2004; Braun and Harkness, 2005). As a result, semantic equivalence of words or questions is, of itself, not a sufficient guarantee of functional equivalence. At the same time, the considerable attention paid to the selection and formulation of source questionnaire items in sequential approaches often means that semantic equivalence (i.e., meaning at
face value) across languages is assumed to be a strong indicator of an accompanying underlying conceptual equivalence.

Apart from the ESS, prominent multi-national surveys that follow a sequential ASQ model include:

• the International Social Survey Programme (an annual academic social science survey that covers around 40 countries: http://www.issp.org)
• the European and World Values surveys
• the family of 'Barometer' surveys in Eastern Europe, Asia, Africa and Latin America
• the Survey on Health, Ageing and Retirement in Europe (SHARE)
• the WHO World Mental Health Initiative.

Like many other international studies, the ESS includes questions that have been used in other studies. When questions that were designed for one study are replicated in another, problems may arise in respect of equivalence, translation and adaptation (Harkness, 2004; Harkness et al., 2004). The procedure followed in the ESS is to invite comments and contributions from all participating countries in each round on the draft source questions.

The substantive questions in the ESS are accompanied by a prescribed set of socio-demographic 'background' questions. Some of these background questions can be translated in the same way as substantive questions, while others, such as education, require country-specific content and formulation.

Organisation and specification
Organisation

As noted in chapter 1, the translation arrangements and their coordination are the primary responsibility of a team within ZUMA, one of the ESS partner institutions, aided by a Translation Expert Panel (see chapter 1). The task of annotating the source questions and of responding to queries from participating countries is shared between ZUMA and the ESS coordinating office at City University.
Specification

One of the 'rules' governing ESS translation practices is derived from the original Blueprint document (ESF, 1999), which stipulated that translations should be made into any language used as a first language by five per cent or more of a country's population. As a result, nine countries have been required to translate their questionnaires into more than one language: two languages each in Belgium, Estonia, Finland, Slovakia, Spain and Ukraine, three
languages each in Israel and Switzerland, and four languages in Luxembourg (see later section of this chapter for a more detailed listing).

The detailed arrangements and recommendations for translation were developed by the head of the translation team at ZUMA, in consultation with the Translation Expert Panel. Each participating country is required not only to reserve an appropriate budget for their translations and pre-testing according to the centrally provided specification but also to assume responsibility for the quality of the translations it ultimately produces. The specification produced in Round 1 has been somewhat modified in subsequent rounds, but remains substantially intact (for the latest version see www.europeansocialsurvey.org).

The ESS translation guidelines and support materials are designed to provide National Coordinators with detailed guidance on translation and translation assessment procedures. The key points covered are as follows:

• Countries are required to use two translators for each language they employ and to adopt a team approach to the process (TRAPD – described in detail below). Details of the approach identify the range of skills that each team needs to have and the procedures to be followed in producing translations and evaluating and revising the outcomes. Revisions are based on team discussions of the draft translations in direct comparison with the English-language source questionnaire. This is then followed by pre-testing of the questionnaire and examination of pre-test findings.
• As noted above, the questionnaire and supporting field documents must be translated into any language spoken as a first language by five per cent or more of a country's population.
• As numerous questions may be replicated in future rounds, it is important to decide on formulations/translations that are likely to stand the test of time. Wording changes between rounds are to be avoided wherever possible.
• Countries that share languages are required to consult one another on their versions in order to reduce unnecessary differences between same-language versions.
• Documentation of the translation process and translation products is required in order to facilitate replication in future rounds as well as comparisons within a round of different national versions in the same language.
• All countries are required to take into account the literacy level of their population.

Since the ESS translation requirements are detailed, concrete and innovative, countries participating for the first time might not be familiar with the kind of procedures proposed. The translation guidelines thus aim to be both detailed and accessible for all participants, and a help and query desk is also part of the support service provided by the ZUMA team. Participants in Round 1 were provided with sets of guidelines, descriptions and examples at a series of
National Coordinators' Meetings. New entrants to the ESS in later rounds are provided with an overview that covers these materials and are given the opportunity to consult one-to-one with ESS translation specialists.

The Translation Procedure: TRAPD

An acronym for Translation, Review, Adjudication, Pre-testing and Documentation, TRAPD consists of five interrelated procedures (Harkness, 2003). These form the basis of the team approach to translation developed for the ESS. The procedures are open to iteration so that – as adjustments are made to draft versions of translated questions – the review and documentation activities may also need to be repeated.

The three key roles involved in the translation effort are those of translator, reviewer and adjudicator. Translators need to be skilled practitioners who have ideally received training in the translation of questionnaires. The ESS calls for two translators per questionnaire, each of whom is required to translate out of English into their strongest language (normally their 'first' language). Having translated independently from each other, they then take part in the subsequent review session(s). The notes they make on their translations provide valuable information for the review session.

Reviewers need not only to have good translation skills but also to be familiar with the principles of questionnaire design and the particular study design and the topics covered. A single reviewer with linguistic expertise, experience in translating, and survey knowledge is sufficient. But if one person cannot be found with all these skills, then more than one may be used to cover the different aspects. Reviewers do not produce either of the two draft translations but attend the review sessions and contribute to and guide the revisions. Once again, notes from the review may inform any subsequent adjudication.

The adjudicator is the person responsible for the final decisions about which translation options to adopt.
This is often the National Coordinator but may also be another nominated person in the team. In any event, adjudicators must have knowledge of the research topics and survey design more generally, as well as being proficient in the languages under discussion. Especially when there are multiple languages involved, the adjudicator – as the person responsible for signing off the translations – may not be sufficiently proficient in one or other of the languages. In those cases, the adjudicator is required to work closely with the reviewer or a further suitable consultant to complete the task. The adjudicator ideally attends the review and contributes to revisions. Failing this, adjudication takes place at a later independent meeting with the reviewer and any other relevant experts.

This multi-stage approach was adopted for three main reasons: to mitigate the subjective nature of translation and text-based translation assessment; to ensure appropriate stage-by-stage documentation which helps both adjudicators and
subsequent analysts; and to allow comparisons of translations by countries which share a language.

The decision to secure input from two translators was motivated by a growing body of research that points to the advantages of discussing translations in a team (e.g., Guillemin et al., 1993; Acquadro et al., 1996; McKay et al., 1996; Harkness et al., 2004). By giving the team two draft versions to discuss, more options are automatically available for discussion and one translator is not called on to 'defend' her/his product against the rest of the team. In addition, regional variance, idiosyncratic interpretations, and inevitable translator oversights can be better dealt with (Harkness and Schoua-Glusberg, 1998). Moreover, team approaches enable closer appraisals and more detailed revisions than do methods that use a single translator and compartmentalised review procedures. The team approach specified for the ESS ensures that people with different and necessary fields of expertise are involved in the review and adjudication process. When properly conducted by a well-selected and well-briefed team, a team approach brings considerable benefits. Even so, the procedures themselves are no automatic guarantee of quality.

As noted, the TRAPD strategy incorporates review into translation production. This approach was selected in preference to the more traditional model of translation followed by back translation. A major drawback of back translation as a review process is that it ultimately focuses attention on two versions of the source questionnaire rather than concentrating appropriate attention on the translation in the target language. For detailed discussion of back translation and its inherent weaknesses, see Harkness (2003) and Harkness (forthcoming); see, too, Brislin (1970, 1980, 1986), Hambleton (2005), Pan and de la Puente (2005) and McKenna et al. (2005).
Split and parallel translations

As noted, the first stage of our procedures involves two translators per language working independently of each other to produce two parallel translations. At a subsequent reconciliation meeting, the translators and a reviewer go through the questionnaire question by question, discussing differences and, if possible, coming to a consensus on a single new version. If the adjudicator is also present, a final version may be reached in one sitting. Otherwise, any unresolved differences go forward to an adjudication meeting.

However, ESS countries that plan to discuss their translations with other countries producing translations in the same language are permitted to produce only one translation. At the same time, the fact that a country intends to discuss its version with another country does not imply that the translation produced within one country is less important, nor that it needs
to be undertaken with less care. In producing their single country version, each country is still required to use two translators. The source questionnaire is split up between these two in the alternating fashion used to deal cards in many card games, thus ensuring each translator gets an even spread of the questionnaire to translate (Schoua-Glusberg, 1992). The two halves of the questionnaire are then merged for the review discussion, during which the translators, the reviewer and (usually) the adjudicator go through the translation question by question, discussing alternatives and agreeing on a single version.

Once these national versions are produced, the countries sharing languages arrange to discuss their different versions with a view to harmonising them wherever appropriate. In discussions between two countries, therefore, two versions of the questionnaire are available, one produced by one country, one produced by the other. The precise steps recommended for sharing are described in documentation available on the ESS website. In recommending that countries who share a language should co-operate on reducing differences across their questionnaires, the ESS chose to encourage harmonisation where appropriate but to refrain from a strict policy of enforced harmonisation by which countries sharing a language would be required to use identical question wording. Countries are encouraged to plan ahead so as to schedule sharing activities for the new 'rotating' modules at each round.

Countries with more than one language
Producing multiple translations

With the exception of Ireland and the UK, all ESS countries need to translate into at least one language and many had to field the survey in multiple languages so as to meet the five per cent rule referred to earlier. Translation demands on individual countries can be very different. For example, in Round 1, Switzerland and Israel both produced three written translations. Whereas Switzerland shared each language with at least one other country and could theoretically consult with these, Israel was the only country in that round to translate into Hebrew, Arabic and Russian. Luxembourg also fielded the questionnaire in several languages in the first round but, contrary to ESS requirements, did not produce a written translation in that round for each language in which interviews were conducted. Table 4.1 indicates that for Round 2, nine languages were shared and that 16 of the countries listed could, in principle, have shared the development of at least one translation with another country or countries.
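The five per cent rule referred to above amounts to a simple filter over first-language population shares. A minimal sketch of how a country's required translation languages might be derived (the shares below are invented for illustration, not actual census figures):

```python
def languages_requiring_translation(first_language_shares, threshold=0.05):
    """Return, in alphabetical order, every language spoken as a first
    language by at least `threshold` (five per cent) of the population.

    `first_language_shares` maps language name -> share of population (0-1).
    """
    return sorted(lang for lang, share in first_language_shares.items()
                  if share >= threshold)

# Invented illustrative shares for a hypothetical multi-lingual country:
shares = {"German": 0.64, "French": 0.23, "Italian": 0.08, "Romansh": 0.005}
print(languages_requiring_translation(shares))  # ['French', 'German', 'Italian']
```

Languages just below the threshold (Romansh in this invented example) would not trigger a required written translation, although a country could of course still choose to field one.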
Table 4.1 Countries translating and/or sharing languages in Rounds 1 and 2

Country         | Language(s) translated   | Language in common with                                   | Round(s)
Austria         | German                   | Germany, Luxembourg, Switzerland                          | 1,2
Belgium         | Flemish (1), French      | The Netherlands; France, Switzerland, Luxembourg          | 1,2
Czech Republic  | Czech                    | –                                                         | 1,2
Denmark         | Danish                   | –                                                         | 1,2
Estonia (2)     | Estonian, Russian        | Israel (2), Ukraine (Russian)                             | 2
Finland         | Finnish, Swedish         | Sweden (Swedish)                                          | 1,2
France          | French                   | Switzerland, Luxembourg, Belgium                          | 1,2
Germany         | German                   | Luxembourg, Switzerland, Austria                          | 1,2
Greece          | Greek                    | –                                                         | 2
Hungary         | Hungarian                | Slovak Republic                                           | 1,2
Iceland         | Icelandic                | –                                                         | 2
Ireland         | –                        | UK, Luxembourg                                            | 1,2
Israel (2)      | Hebrew, Arabic, Russian  | Estonia, Ukraine (Russian)                                | 1
Italy           | Italian                  | Switzerland                                               | 1,2
Luxembourg (3)  | French, German           | France, Switzerland, Belgium (French); Germany, Austria, Switzerland (German) | 1,2
Netherlands     | Dutch (1)                | Belgium                                                   | 1,2
Norway          | Norwegian                | –                                                         | 1,2
Poland          | Polish                   | –                                                         | 1,2
Portugal        | Portuguese               | Luxembourg                                                | 1,2
Slovakia        | Slovakian, Hungarian     | Hungary, Ukraine                                          | 2
Slovenia        | Slovene                  | –                                                         | 1,2
Spain           | Catalan, Spanish         | –                                                         | 1,2
Sweden          | Swedish                  | Finland                                                   | 1,2
Switzerland     | French, German, Italian  | France, Belgium, Luxembourg (French); Germany, Austria, Luxembourg (German); Italy (Italian) | 1,2
Turkey          | Turkish                  | –                                                         | 2
UK              | –                        | Ireland, Luxembourg                                       | 1,2
Ukraine (2)     | Russian, Ukrainian       | Estonia, Israel (Russian)                                 | 2

Notes: (1) Written Dutch and Belgian Flemish are very similar; (2) Israel did not participate in Round 2, and Ukraine and Estonia did not participate in Round 1; (3) Luxembourg did not have written translations for some languages in Round 1 and it also fielded some questionnaires in English.
Sharing languages and harmonisation

The ESS countries that produced a questionnaire in the same language as another country in Round 2 are set out in Table 4.2. We have treated Dutch and Belgian Flemish as close enough to be able to be classified in this way.

Table 4.2 Round 2 countries translating questionnaires into the same languages

Shared language | Countries
Dutch/Flemish   | Belgium, The Netherlands
French          | Belgium, France, Switzerland, Luxembourg
German          | Austria, Germany, Switzerland, Luxembourg
Italian         | Italy, Switzerland
Hungarian       | Hungary, Slovakia
Russian         | Ukraine, Estonia
Swedish         | Sweden, Finland
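Groupings of this kind follow mechanically from the country-language assignments: inverting the mapping and keeping only languages fielded by more than one country yields the candidates for harmonisation. A sketch of that derivation, using only a small invented subset of the Round 2 assignments:

```python
from collections import defaultdict

def shared_languages(country_languages):
    """Invert a country -> languages mapping and keep only the languages
    fielded by more than one country (the candidates for harmonisation)."""
    by_language = defaultdict(list)
    for country, languages in country_languages.items():
        for language in languages:
            by_language[language].append(country)
    return {lang: countries for lang, countries in by_language.items()
            if len(countries) > 1}

# Small subset of the Round 2 assignments, for illustration only:
round2 = {
    "Belgium": ["Dutch/Flemish", "French"],
    "Netherlands": ["Dutch/Flemish"],
    "France": ["French"],
    "Sweden": ["Swedish"],
    "Finland": ["Finnish", "Swedish"],
}
print(shared_languages(round2))
# → {'Dutch/Flemish': ['Belgium', 'Netherlands'],
#    'French': ['Belgium', 'France'], 'Swedish': ['Sweden', 'Finland']}
```

Languages fielded by only one country (Finnish in this subset) drop out, exactly as they do from Table 4.2.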
Ancillary measures to support translation

A number of ancillary measures have been developed to facilitate ESS translation efforts.
Annotating the source questionnaire

Survey questionnaires tend to look rather simple, but they are in fact complex measurement tools. One reflection of this is the fact that interviewer manuals often provide interviewers with definitions and clarifications of terms as they are intended in a given questionnaire. For instance, a household composition question such as 'How many people, including children, live in this household?' is often accompanied in the interviewer manual by definitions of what counts as a 'household', what counts as 'live in', and reminders about various categories of people who may stay for different periods and on different financial bases in the dwelling unit (for example, boarders, lodgers, servants, visiting relatives). This information thus clarifies which categories of people are to be counted and guides interviewers on how to answer respondent queries about what the question means.

Annotations in the source questionnaire for translators and others involved in producing new versions serve a similar purpose. They provide information on what the question is intended to measure. For instance, 'household' in the question above might be automatically associated with 'home' and hence with 'family' in some cultures. In measurement terms, however, the household
composition question is intended to refer to a dwelling unit. A note for those producing other language versions of this question could guide them on the required concept, however differently this might then be realised in different countries. Annotations are not intended to be incorporated literally into translated questions, nor provided to interviewers as notes. They are simply to be used as aids to the design of functionally equivalent questions.
Query hotline and FAQs

ESS participants are encouraged to contact the translation team based at ZUMA about any difficulties they encounter during translation. By the second round of the ESS, we were thus able to compile a list of frequently asked questions and the appropriate replies. In the course of doing so, it has become evident that many of the queries have at least as much to do with measurement properties of the source questionnaire as with translation issues per se.
Documentation templates

In the first two rounds, the ESS questions were delivered to participating countries in two formats. On the one hand, a formatted paper-and-pencil source questionnaire was distributed as a Word file. On the other, the questionnaire was sent out in a Word table template that assigned each question, instruction, and set of response categories to individual rows, and provided empty columns for the two envisaged translations and the translation and review comments that would inform the review process and later document the translation output.

In social science survey projects, documentation of translation decisions and difficulties is, to date, rare. ESS countries were asked to document translation and review decisions using the template provided for six main reasons:

• The TRAPD approach is based on discussion and revision of drafts to arrive at final versions. Having a record of the points felt to be at issue in the form of notes or comments from translators can greatly facilitate the review discussion. Note-taking is a tool that trained translators often use to accelerate translation and revision and it helps them recall the rationale for particular decisions.
• Countries sharing languages need a record of the individual decisions and their rationale in order to compare and contrast their alternative versions.
• Later rounds of the survey will be informed by good records of the initial problems encountered and their solutions. If changes are made in
either source questions or translations over time, records need to be available of the chain of changes across translations (version tracking and documentation).
• New members joining the programme that share languages with countries already involved will have the documentation available to them.
• Data analysts will be able to consult such documents as part of their efforts to interpret the data.
• The template was a means of encouraging teams to take notes alongside points at issue as they worked. Experience has shown that if writing up problems or solutions is delayed, many details will be forgotten.

Lessons learned

The development and refinement of ESS translation guidelines and support materials has been a learning experience for those involved. Insights gained are summarised briefly below.
Source questionnaire and translation

Although participating countries are given every chance to comment on draft modules and encouraged to help question drafting teams resolve cultural and linguistic problems at an early stage, most problems are either not noticed or neglected until the questions come under close scrutiny when the translation proper begins. The advance translation procedures described below could help remedy this.
Advance translation

While national teams are asked not to begin their official translations before the source questionnaire for each round has been finalised, we do suggest that they use even rough-and-ready translations as a problem-spotting tool (see discussion in Harkness, 1995, 2003; Harkness and Schoua-Glusberg, 1998; Harkness et al., 2004; Braun and Harkness, 2005). The expectation is that by jotting down a first translation of each question during their appraisal of the draft source questionnaire, participating countries can help identify problems before the source questionnaire is finalised, at a time when wording can still be more readily changed.

The dynamic process of ESS question design can now be monitored 'live' on the web and National Co-ordinators will be encouraged in future to try this technique of advance translation in appraising the draft source questions.
Templates and production tools

It is important to minimise changes to the source documents once translation has begun, but changes cannot be completely avoided even at this stage, because they often arise from late feedback. Alterations to 'finalised' source documents were much reduced in Round 2 compared to Round 1, so perhaps we are coming to grips with this problem. Even so, better tools that help national teams to keep up to date with changes and allow them to continue to use a template aligning source questions and translations would clearly be of great benefit. The infrastructure grant that the ESS has recently received will enable us to produce a blueprint for such tools by the end of 2007. Meanwhile, checklists can be used to ensure that common production errors (such as the reversal of response options, or omissions) are swiftly spotted and remedied during the translation or review process.
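Checks of this kind lend themselves to simple automation. A sketch of two of the checks just mentioned, omissions and reversed response options, assuming each questionnaire version is held as a mapping from item number to its ordered list of response codes (the item data below are invented for illustration):

```python
def production_checklist(source, translated):
    """Compare a translated questionnaire against the source and flag two
    common production errors: omitted items and reversed response options."""
    problems = []
    for item, options in source.items():
        if item not in translated:
            problems.append(f"{item}: item missing from translation")
        elif translated[item] == list(reversed(options)):
            problems.append(f"{item}: response options reversed")
    return problems

# Invented example: B27 omitted, B25's scale accidentally reversed.
source = {"B25": ["00", "01", "02", "03"], "B26": ["00", "01", "02", "03"],
          "B27": ["00", "01", "02", "03"]}
translated = {"B25": ["03", "02", "01", "00"], "B26": ["00", "01", "02", "03"]}
for problem in production_checklist(source, translated):
    print(problem)
# → B25: response options reversed
#   B27: item missing from translation
```

A real checklist would cover more than these two errors (missing interviewer instructions, dropped filter routing, and so on), but the same item-by-item comparison against the source applies.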
Attention to detail

Initial evaluations of some of the ESS translations from the first two rounds suggest that countries sharing languages would benefit from cross-country discussion of the draft questionnaires. Efforts will be made to enhance such collaboration in coming rounds. Question-by-question review by a team of specialists, as recommended in the ESS specification, greatly increases the likelihood of finding and remedying major translation mistakes as well as subtler errors. It is clear, however, that some countries have been more successful than others in meticulously implementing the ESS procedures. Certainly, mistakes found in the first two rounds suggest that some countries had failed to use the specified team approach effectively.

Countries also need to remember to carry out normal copy-editing and proof-reading of their translations to ensure that nothing has been omitted or misplaced in the course of the translation effort. The copy-editing review should check both the correctness of the translated text as text and its completeness. Copy-editors also need to check back against the source questionnaire in order to identify inadvertent omissions or reversals. Questions that countries have raised with the head of the translation team at ZUMA reflect the difficulty translators sometimes have in understanding the purpose of survey translations. Since techniques and strategies that translators find useful for other kinds of texts are often not appropriate for survey translations, even very good translators need to be thoroughly
briefed on the special nature of survey translation. It may take some concentrated effort before translators and possibly their supervisors can be brought to appreciate this.
Identifying translation errors

Errors in translation have a debilitating effect on data comparability. An important and evolving part of the quality control work performed by the CCT is identifying survey error related to translations. For this, a combined approach is necessary (Harkness et al., 2004): on the one hand, the quality of translations needs to be monitored by people knowledgeable about the languages involved and the characteristic properties of questionnaires; on the other hand, a variety of statistical procedures can be employed to check various sources of potential error in the data, of which translation errors may be one. Billiet (2006) provides an example, based on ESS questions in French.

Conclusion

For a variety of reasons, good survey translation is far more difficult than it may appear. Without an understanding of the principles of questionnaire design and survey measurement, even skilful translators will not be able to produce the kind of document required. By bringing together translators and survey specialists, the team approach adopted by the ESS goes a good way towards recognising this. At the same time, research on ESS translation indicates that translators would benefit from more briefing and training on survey translations, and that translation supervisors might need to be better informed about the risks of corner-cutting. Poor translations deprive researchers of the opportunity to present respondents with the questions they intended. The ESS has invested unusual effort and expense in attempting to develop a practical, theoretically sound and rigorous framework for translation. We are aware that challenges and problems remain, and continuing efforts are being made to iron out remaining difficulties. The recent evaluation of ESS translation outputs so as to inform new strategies is one example.
Our intention is to introduce 'review-and-learn' projects in future rounds that will contribute to our common understanding of what needs to be done and how best to implement it on the ground.
References

Acquadro, C., Jambon, B., Ellis, D. and Marquis, P. (1996), 'Language and Translation Issues' in: B. Spilker (ed.), Quality of Life and Pharmacoeconomics in Clinical Trials, 2nd edition, Philadelphia: Lippincott-Raven.
Billiet, J. (2006), 'Things that go wrong in Comparative Surveys – evidence from the ESS', paper presented at the ESRC Methods Festival, Oxford University, 20 July 2006.
Braun, M. and Harkness, J.A. (2005), 'Text and Context: Challenges to Comparability of Survey Questions' in: J.H. Hoffmeyer-Zlotnik and J.A. Harkness (eds), ZUMA-Nachrichten Spezial No.11. Methodological Aspects of Cross-National Research, Mannheim: ZUMA.
Brislin, R.W. (1970), 'Back-translation for cross-cultural research', Journal of Cross-Cultural Psychology, 1 (3), pp.185–216.
Brislin, R.W. (1980), 'Translation and Content Analysis of Oral and Written Materials' in: H.C. Triandis and J.W. Berry (eds), Handbook of Cross-Cultural Psychology, Boston: Allyn & Bacon.
Brislin, R.W. (1986), 'The wording and translation of research instruments' in: W.J. Lonner and J.W. Berry (eds), Field Methods in Cross-Cultural Research, Beverly Hills, CA: Sage.
de Mello Alves, M.G., Chor, D., Faerstein, E., de S. Lopes, C. and Guilherme, L. (2004), 'Short version of the "job stress scale": a Portuguese-language adaptation', Rev Saúde Pública, 38 (2), pp.164–171.
ESF (European Science Foundation) (1999), Blueprint for a European Social Survey, Strasbourg: ESF.
Guillemin, F., Bombardier, C. and Beaton, D. (1993), 'Cross-Cultural Adaptation of Health-Related Quality of Life Measures: Literature Review and Proposed Guidelines', Journal of Clinical Epidemiology, 46 (12), pp.1417–1432.
Hambleton, R.K. (2005), 'Issues, Designs, and Technical Guidelines for Adapting Tests in Multiple Languages and Cultures' in: R.K. Hambleton, P. Merenda and C.D. Spielberger (eds), Adapting Educational and Psychological Tests for Cross-Cultural Assessment, Hillsdale: Erlbaum.
Harkness, J.A. (1995), ISSP Methodology Translation Work Group Report 1995, report to the ISSP General Assembly at the 1995 Cologne ISSP meeting.
Harkness, J.A. (2003), 'Questionnaire Translation' in: J.A. Harkness, F. van de Vijver and P. Mohler (eds), Cross-Cultural Survey Methods, New York: John Wiley and Sons.
Harkness, J.A. (2004), 'Overview of Problems in Establishing Conceptually Equivalent Health Definitions across Multiple Cultural Groups' in: S.B. Cohen and J.M. Lepkowski (eds), Eighth Conference on Health Survey Research Methods, Hyattsville: US Department of Health and Human Services, pp.85–90.
Harkness, J.A. (forthcoming), 'Comparative Survey Research: Goals and Challenges' in: J. Hox, E.D. de Leeuw and D. Dillman (eds), International Handbook of Survey Methodology, Mahwah: Lawrence Erlbaum.
Harkness, J.A., Pennell, B.E. and Schoua-Glusberg, A. (2004), 'Survey Questionnaire Translation and Assessment' in: S. Presser, J.M. Rothgeb, M.P. Couper, J.T. Lessler, E. Martin, J. Martin and E. Singer (eds), Methods for Testing and Evaluating Survey Questionnaires, Hoboken: John Wiley and Sons.
Harkness, J.A. and Schoua-Glusberg, A. (1998), 'Questionnaires in Translation' in: J.A. Harkness (ed.), Cross-Cultural Survey Equivalence, ZUMA-Nachrichten Spezial No.3, Mannheim: ZUMA.
McKay, R.B., Breslow, M.J., Sangster, R.L., Gabbard, S.M., Reynolds, R.W., Nakamoto, J.M. and Tarnai, J. (1996), 'Translating Survey Questionnaires: Lessons Learned', New Directions for Evaluation, 70, pp.93–105.
Pan, Y. and de la Puente, M. (2005), Census Bureau Guidelines for the Translation of Data Collection Instruments and Supporting Materials: Documentation on How the Guideline was Developed, Statistical Research Division, U.S. Census Bureau.
Schober, M.F. and Conrad, F.G. (2002), 'A Collaborative View of Standardized Survey Interviews' in: D.W. Maynard, H. Houtkoop-Steenstra, N.C. Schaeffer and J. van der Zouwen (eds), Standardization and Tacit Knowledge: Interaction and Practice in the Survey Interview, New York: John Wiley and Sons.
Schoua-Glusberg, A. (1992), Report on the Translation of the Questionnaire for the National Treatment Improvement Evaluation Study, Chicago: National Opinion Research Center.
van de Vijver, F.J.R. (2003), 'Bias and Equivalence: Cross-cultural Perspectives' in: J.A. Harkness, F.J.R. van de Vijver and P. Mohler (eds), Cross-Cultural Survey Methods, New York: John Wiley and Sons.
5
If it bleeds, it leads: the impact of media-reported events

Ineke Stoop∗
Introduction1

In November 2002, the tanker Prestige broke up off the west coast of Spain, causing what many predicted would be the most damaging oil spill since the Exxon Valdez ran aground off the coast of Alaska in 1989. Several towns and beaches along the Galician coast, which earned its nickname 'coast of death' from the many shipwrecks near its shores, were fouled by oil. In Madrid, environmental activists held demonstrations against the oil spill, and in Santiago de Compostela, social groups and politicians from Spain's opposition parties led a march under the battle cry of nunca mais ('never again'). According to the organisers and local police, 150,000 people marched in protest against the inadequate response of the regional and national governments in handling the crisis.

At about the same time, fieldwork for the first round of the European Social Survey was starting in Spain. The damaging oil spill, the perceived inadequacy of the government's response and the large protest rallies, which received widespread media attention all over Europe, may well have had an impact on the answers of Spanish citizens to ESS questions about trust in government and trust in politicians. Any differences found between
∗ Ineke Stoop is Head of the Department of Data Services and IT at the Social and Cultural Planning Office of the Netherlands.
1 A number of ideas in this chapter on the use of news sources and ways to improve event reporting stem from discussions with, and papers from, Howard Tumber (City University London), Paul Statham (University of Bristol) and David Morrison (University of Leeds).
levels of trust in government in Spain and other European countries, or between the level of political trust in Spain in Round 1 compared with that in subsequent rounds of the time series, might be directly attributable to the short-term impact of these events.

The role of the ESS is to measure social, cultural and attitudinal climate changes across Europe, rather than transitory changes in the attitudinal weather. Analysts of the data should thus ideally be in a position to distinguish between the two, or at least to identify and possibly discount the impact of short-term events on expressed attitudes. This is especially important because socio-political conditions in Europe seem to have become particularly volatile during the last decade or so.

Although the ESS's initiators had noted the need to record events when they first started their planning work in the mid-1990s, the events they had in mind at that stage were primarily national elections or short-term political upheavals. Since that time, however, we have witnessed the 9/11 attacks in the USA, major terrorist attacks in Madrid and London, a war in Afghanistan, a war in Iraq, the Darfur crisis, a devastating tsunami in Asia, political assassinations in the Netherlands, plans in some countries to suspend certain civil rights, an overwhelming 'no' vote in two national referendums on the EU constitution, and a fair number of political and financial scandals. Is Europe simply going through an isolated outbreak of extreme weather conditions, or is the political climate changing rather rapidly? Either way, it is increasingly important to make this sort of background information available to survey analysts of a cross-national time series. The Blueprint for the ESS (ESF, 1999) recognised this and called for an 'event database' to be made available alongside the substantive database.
Although no funds were initially available for event reporting, we considered it important enough to make a start by developing a parsimonious but reasonably effective system. National Coordinators (NCs) and their teams in each country took on the task of producing a systematic overview of events during the ESS fieldwork period and submitting their reports for collation into a central event database, which has since been set up (http://ess.nsd.uib.no). Indeed, the description of the Prestige oil spill at the beginning of this chapter is taken from this database, which now contains information on media-reported events that took place during the fieldwork periods of ESS Rounds 1 and 2.

The remainder of this chapter describes in more detail not only the rationale behind event reporting, but also how it was implemented in the ESS. It also assesses its success to date, examines inherent
problems in its implementation and considers ways of improving it in future rounds.

"Events, dear boy, events"2

As noted, the ESF's Blueprint document for the ESS called for an events database, giving the following justification for it:

It is well known from earlier comparative survey research that in some fields, such as electoral analysis, individual reactions to certain questions will be influenced by contextual factors and by significant events. For example, a question about the subjective interest in politics of a respondent may well be answered differently at the height of a national campaign for a general election compared to a time when no election is imminent. The contextual impact on individual response behaviour will not create major difficulties for the ESS as long as the contexts and events vary individually in an idiosyncratic fashion. The impact, however, of a contextual factor or an event must be considered and, whenever possible, controlled as soon as whole societies are thus influenced in a way which is not uniform across the countries in the ESS. In addition, it has to be remembered that the ESS will in the long run also become an important asset for historical micro analysis. As a consequence, from the beginning an information tool which for the lack of a better term may be called an event data inventory will have to be designed. This inventory must offer to the researchers a brief, pertinent synopsis of major political, social and other potentially relevant events in the ESS countries; this is particularly important for the ESS since its modular approach will in the long run cover a wide area of substantive concerns (ESF, 1999, p.33).

An early example of the (unexpected) consequences of major events was given by Bradburn (1969), who studied the trauma caused by the assassination of President John F. Kennedy.
He found that this event not only caused feelings of shock, grief and personal loss but also occasioned an increase in interpersonal communication and social cohesion. According to an analysis by Das et al. (2005) of the more recent assassination in the Netherlands,
2 Remark attributed to the British Prime Minister Harold Macmillan when asked by a young journalist, after a long dinner, what could most easily steer a government off course.
there was no increase in social cohesion, but instead a rise in social disorganisation and depression. Das et al. were conducting an experimental study on attitudes to terrorist attacks when the Dutch filmmaker Theo van Gogh was assassinated by a Muslim fundamentalist. The event attracted wide national and international media attention, complicating their initial experiment but opening up new avenues of exploration to them. In the week following van Gogh's murder, numerous anti-Muslim attacks took place in the Netherlands (http://news.bbc.co.uk/2/hi/europe/4057645.stm). Das et al. found that when terrorism occurs on one's own doorstep (rather than in some distant land) the fear of death is magnified greatly. In the Netherlands this resulted in an increase in terror-induced prejudice.

Although, of course, personal and family events may usually have a more profound impact on people's lives and thoughts than will more distant political events, these personal sources of turbulence do not tend to have a systematic impact on survey outcomes. They are, in effect, randomly distributed across the population. In contrast, what is important for the ESS is any systematic effect of events on attitudes at a particular time or in a particular place. It is this sort of turbulence that can cause differences between countries, changes over time and variations between subgroups of the population.

Events in the media

From the range of events that occur in the world every day or week, only a small selection becomes salient to the public. It is these salient, well-publicised events, with high exposure either to large sub-groups in a country, to an entire country, or even to a large group of countries, that have the potential for focusing and shaping the attention of members of the public. The mass media are, of course, the primary conduits through which these potentially salient events are conveyed to the public.
So our primary interest was in media-reported events. The role of the mass media has long been a subject of research in its own right. Nas (2000), for instance, provides an overview of the impact of the media on attitudes to environmental issues. She argues that in earlier decades the favoured theory of the media's impact was the 'hypodermic needle' theory, which suggests that the public, as passive consumers of news, are simply injected with whatever elements of news the media choose to report. From the late 1940s, these assumptions about a passive public began to give way to the idea of selective filters between medium and recipient (selective attention, social networks, selective exposure and interpersonal communication). By the 1960s, however, a
new agenda-setting theory had found favour. As Cohen (1963, p.13) puts it: "(The press) may not be successful in telling people what to think, but it is stunningly successful in telling its readers what to think about." Agenda-setting theory does not imply that media users are merely passive agenda-followers, because it also accepts the existence of filters between the transmitters and recipients of messages (see www.agendasetting.com).

Agenda-setting by the media is by no means a cross-culturally uniform process. As Pfetsch (2004, p.60) points out after studying the role of newspaper editorials in different countries on the subject of European integration:

The media play a significant role as political actors as they use the format of editorials for claims-making, thereby assigning relevance and frames to political issues and introducing their own opinions into public discourse and political debate. In their dual role as communication channels of political actors and as actors in their own right they constitute the major communicative linkages within and between national public spaces which are a basic prerequisite for the Europeanisation of the public sphere.

Even so, Pfetsch concludes that the role of the media is by no means confined to agenda-setting, as "the media's opinion about Europe resonates with the position of the national political elites and at the same time reinforces it" (p.61). There were, it emerged, large differences between the media in different European countries. The British media, for instance, seemed to try hard to ignore the European perspective whenever possible, while the French national media seemed to be the most open to it. Differences between the newspapers and TV channels of different countries are also the main focus of media research conducted by the German media-watching organisation, Medien Tenor (www.medientenor.de).
They have monitored, for instance, the extent to which UN Secretary General Kofi Annan has been ignored in international TV news, the differences between German and UK newspapers and TV stations in reporting public attitudes towards the Euro, and the way in which the US mass media tend to reinforce ethnic stereotypes. These findings are relevant to the ESS, since the starting point of our event-reporting is to employ media reports simply as a means of highlighting the most salient events in each country. The fact that a country's mass media may have systematic biases, and that there is therefore an interaction between public attitudes and media-reported events, is by no means an obstacle to our work but rather the essence of it. To some extent we have to make the further simplifying assumption that as long as an event is salient enough to be reported widely in the newspapers, it is
likely to have at least some impact on the consciousness of non-newspaper readers as well.

Within the context of the ESS, it would not have been possible to set up a comprehensive multi-national media watch system. Instead, we had to make parsimonious choices as to which media to use as source material for event-reporting. Although there were arguments for monitoring television news, the difficulties and costs of systematically recording and coding television news bulletins across more than 20 nations were simply too daunting.3 Instead we chose to monitor newspapers, in the expectation that television news agendas and press agendas coincide in many cases.

News flow and event identification

What sorts of event did we wish to record? We quickly rejected the notion that events that take place far away will necessarily have less impact than events closer to home. The war in Iraq, for instance, had seemingly major (and different) national ramifications. Thus, German Chancellor Schröder's opposition to the war may well have contributed to his successful campaign in the German elections. Meanwhile, major protests about the war took place in the UK, including the resignation of some ministers, and the coalition government in the Netherlands at one time seemed close to falling apart precisely because of conflicting points of view on Dutch involvement in the war. Many other countries had protest demonstrations which may or may not have affected public opinion in those countries differentially. Similarly, Round 2 of the ESS witnessed sustained media attention being devoted to hostage-taking in Iraq involving aid workers and journalists from, among other places, France, Italy and the UK. And then the tsunami in Asia on 26 December 2004 occupied press reports for weeks. Certain countries, notably Sweden, witnessed even more press attention than others because they had lost hundreds of citizens in the disaster.
Meanwhile, political criticism was voiced in several countries about the lack of appropriate support in the aftermath of the disaster, and some intended fundraising events quickly turned into major national events. Another example of a major international event with the potential for different national implications was the death of Pope John Paul II in April 2005, which attracted especially sustained coverage in Italy and Poland.
3 In the first round of the ESS, Greece did tape the news for event-reporting purposes.
So, more relevant to us than where an event took place was when it took place. Our aim was to be able to link events to survey answers, so the ideal situation was one in which an event could be identified with a clear start and end. But many events do not behave like this. Instead they linger on and sometimes re-emerge. Events can also have later repercussions, attracting renewed attention, say, one year after they took place. For instance, Maney and Oliver (2001) investigated the use of news media versus police records to track shifting patterns of protest across time and place. They concluded that neither the news media nor police records fully captured the picture, and they challenged the (usually tacit) assumption that newspaper coverage of an event reasonably closely matches the event itself, noting (p.166) that "much of the event coverage appeared weeks, if not months, before or after an event". Even elections and referendums that in most countries happen on a single day tend to have a longish period before and after during which they cast their shadow on public perceptions and attitudes. It is for this reason that several countries have postponed ESS fieldwork so as to avoid the immediate impact of national elections.

In any case, not all 'events' are directly related to their ostensible subject matter. Take the Hutton Inquiry in the UK, a judicial inquiry set up to investigate a row between the BBC and the government over the 'exposure' of a BBC source who then committed suicide. Although the Inquiry effectively found in favour of the government, most astute commentators believe that it resulted in a serious loss of public faith in the government. Yet in Italy, for instance, where the Prime Minister was alleged to be involved in dubious financial dealings, most commentators believe that the scandal inflicted little or no lasting damage on public trust in politicians.
The fact is that different countries see seemingly similar events through quite different lenses. In the same way, the two national referendums on the EU constitution in 2005, one in France and one in the Netherlands, both took place during 'difficult' political times in their respective countries. So they may well have tapped hostility towards the governments of the day at least as much as hostility towards the proposed constitution. And the effect of these two 'no' votes may well have influenced attitudes to the EU in many other countries simultaneously. Thus, even 'simple' events like these, closely fixed as they are in time and place, may have far-reaching effects on many issues in many countries, sometimes even before the actual event has taken place.
Guidelines and database

As noted, ESS event reporting has developed alongside the ESS survey. Well before the start of Round 1 fieldwork, we conducted a trial run by asking NCs4 to collect event data for a trial period of six weeks. They were asked to record major national (or international) events that might influence the answers to substantive questions in the (then draft) questionnaire. We provided them with a prompt-list of possibly relevant political events – inspired by Taylor and Jodice (1986) (see Note 1 at the end of this chapter) – complemented by a number of less political events, asking them to record such events only if they appeared on the front pages of national newspapers for two or more days in the period, as well as attaining television coverage. No report format was provided.

Having reviewed the outcome of the trial, we produced new guidelines for event-recording to be implemented in ESS Round 1. Each participating country was to send in monthly reports on events that received 'prominent attention' in national newspapers. This was defined to mean 'front page news' or 'appearing regularly in larger articles on later pages' on several days. NCs were asked to assign events to fixed categories, to provide keywords and a description, to give a start date and an end date (if possible), to mention the source, and to assess the likely size and direction of the event's impact on the survey answers. Although this work certainly produced a helpful database, it was based on a system with too much built-in leeway for individual reporting variation. This led us to revise the guidelines for Round 2, as presented in Table 5.1, so that they included a more standardised format and were based on weekly, rather than monthly, reports. The NCs were also required to provide information on the newspapers they had used.5 Information could be collected from the newspapers themselves or from websites containing the newspapers. In Round 2, reporting started two weeks before the start of fieldwork (which differed somewhat across countries), and NCs were also asked for a short overview of any major events in their country since Round 1 that

4 In several countries event-reporting was done by a special 'event reporter' on behalf of the NC.
5 Based on the experience of Round 1, the original request to national reporters was to use two newspapers – a broadsheet and a tabloid – or, even better, a left-wing broadsheet, a right-wing broadsheet and a tabloid. However, feedback from NCs revealed that this was not always the best option in certain countries, both because tabloids and broadsheets may not be comparable across countries, and because some countries (such as Switzerland) have very different newspapers in different regions, in some cases in different languages.
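The reporting format just described is essentially a fixed record structure. A minimal sketch of such a record follows, with illustrative field names of my own choosing; the actual ESS event database schema may differ.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Hypothetical record mirroring the Round 2 reporting guidelines.
# Field names are illustrative, not the ESS database schema.
@dataclass
class EventReport:
    name: str                        # interpretable headline, e.g. "Prince Claus dies"
    categories: list                 # one or more of the fixed categories
    description: str                 # similar to a newspaper header or news introduction
    media_date: date                 # date the event appeared in the media
    event_date: Optional[date]       # date of the event itself, if identifiable
    duration: str                    # "sudden" or "continuing"
    coverage: str                    # attention in the media
    source: str                      # which newspaper/website
    web_link: Optional[str] = None   # only if free and (semi-)permanent
    questionnaire_links: list = field(default_factory=list)  # e.g. ["B18"]
    fieldwork_effect: Optional[str] = None

report = EventReport(
    name="Prince Claus dies",
    categories=["National events"],
    description="Prince Claus has died after 20 years of serious health problems.",
    media_date=date(2002, 10, 6),
    event_date=date(2002, 10, 6),
    duration="continuing",
    coverage="All national newspapers and TV programmes",
    source="national broadsheet (example)",
)
print(report.name)  # prints "Prince Claus dies"
```

Fixing the record structure in this way is precisely what reduced the "built-in leeway for individual reporting variation" that the Round 1 guidelines suffered from.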
Table 5.1 types
103
Framework for event-reporting, ESS 2004, and variety of entry
Information requested Explanation
Examples
Name
Name of specific event
Minister of Education steps down after school fraud 500 000 turnout at demonstration against care budget cuts Paid parental leave: Mum can stay at home now The name of a speGreece are European football champions cific event may be an Fundamentalist Muslims accused of terrorism interpretable Housing market collapses newspaper headline Scathing judgment on quality of childcare (but not ‘Dust to dust’, Tornado in Toledo or ‘Double Dutch’ or Kidnapping in Iraq ‘Home alone’ or Herb cure saves lives ‘Trojan horse victory’) Prince Claus dies Hospital scandal: 30 patients infected Major credit card fraud Opening of parliamentary year: the future looks bleak Opinion poll on democracy: all in for personal gain Low turnout at EU referendum
Category
Select one or more categories; add category if necessary (highlight)
Election (national, local), plebiscite, referendum Resignation, appointment, dismissal of politically significant person Fall of cabinet, change of government, new government Significant change in law Strikes, demonstrations, riots (mention topic) Acts of terrorism Events involving ethnic minorities, asylum seekers Events concerning the national economy, labour market Political, financial, economic scandal, frauds National events (royal weddings, sports championships) Health issues Family matters Crimes (kidnappings, robberies) Disasters (outbreaks of foot and mouth/mad cow disease, extreme weather conditions) International conflict with national impact (Israel–Palestine; Iraq, Pakistan) Major international events that attract close attention locally
Short description
Similar to header in newspaper or introduction of news item
Prince Claus has died after 20 years of serious health problems. The nation mourns. Prince Claus was beloved by many Dutch people for his original contribution to being a Prince. He has become famous for his contributions to developing countries. Many people come to pay him their last respects
Timing
Date event in media, date event, duration (sudden, continuing)
Prince Claus died on 6 October and was buried on 16 October. Wide media coverage of his life, his lying in state and the tribute paid to him by Dutch citizens and dignitaries, and funeral during these 10 days.
(Continued)
Jowell-Chapter-05.qxd
3/9/2007
6:42 PM
104
Table 5.1
Page 104
MEASURING ATTITUDES CROSS-NATIONALLY
(Continued)
Information requested / Explanation / Examples
Coverage
Attention in media
All national newspapers and TV journals, programmes, front page tabloids, extra breaking news
Source
Which newspaper/website
Web link
Only if free and (semi)-permanent
http://news.bbc.co.uk/2/hi/health/3856289.stm
Link to questionnaire
If direct relationship with identifiable question blocks
B18: lawful demonstration (when large demonstration)
B19: consumer boycott (when large consumer boycott)
B12: satisfied about state of education (when educational abuses denounced)
B34: European unification (when heated debates occur, e.g. discussions on Turkey in EU)
C1: How happy are you (when your country wins European football match)
Possible effect on fieldwork
Areas closed off because of animal diseases; heavy storms; confidentiality scandals
Additional information
All additional information
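The reporting template above can be mirrored in a simple data structure. The sketch below is illustrative only: the field names are our own, not the ESS's variable names, and the `duration` values follow the sudden/continuing distinction in the Timing row.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class EventReport:
    """One media-reported event, mirroring the fields of Table 5.1."""
    categories: list        # one or more categories, e.g. ["National events"]
    description: str        # similar to a newspaper header or news introduction
    date_in_media: date     # date the event appeared in the media
    date_of_event: date
    duration: str           # "sudden" or "continuing"
    coverage: str           # attention in the media
    source: str             # which newspaper/website
    web_link: Optional[str] = None          # only if free and (semi-)permanent
    questionnaire_items: list = field(default_factory=list)  # e.g. ["B18"]
    fieldwork_effect: Optional[str] = None  # e.g. areas closed off
    additional_info: Optional[str] = None

# Example: the Prince Claus report from the table above.
claus = EventReport(
    categories=["National events"],
    description="Prince Claus has died after 20 years of serious health problems.",
    date_in_media=date(2002, 10, 6),
    date_of_event=date(2002, 10, 6),
    duration="continuing",
    coverage="All national newspapers and TV",
    source="national press",
)
```

A structure of this kind makes the completeness checks that the CCT performs on incoming reports straightforward to automate.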
might have shaped or altered public attitudes or perceptions. This overview was simply intended to provide some idea of (changes in) the political landscape.

From the beginning of Round 1, incoming event reports have been posted on our website (accessible via http://ess.nsd.uib.no) to give an up-to-date overview, and also to show guidelines, information notes and background information about the process. This transparency was helpful not only to users who wished to have an overview of weekly or monthly events in each country, but also to NCs themselves as a way of checking how their colleagues in other countries were using the system. The web page also contains FAQs and ultimately provides the final ESS media-reported event inventory for each round.

In Round 3 the procedure is very similar to that in Round 2, with one difference: event reports can now be uploaded by the national event reporters themselves and will be part of a more structured database. As a result, different types of overview (per week, per country, per keyword) will be easier to obtain. As Round 3 of the survey is still in the field at the end of 2006, no substantive results are yet available.
If it bleeds, it leads: the impact of media-reported events
Meanwhile, what was happening in Europe? In March 2003 the Coalition Forces attacked Iraq, by which time ESS Round 1 fieldwork had ended in most countries. Even so, the preparations for and the threat of war had been a major 'event' during the Round 1 fieldwork. As noted, what happened in Iraq was not simply an event in a faraway country but had an impact very close to home, even in European countries that were not to be engaged in the war. Other faraway events that drew a great deal of attention were the Bali terror attack, the North Korean nuclear threats and the Chechen hostage disaster in Moscow. At the start of fieldwork there were also devastating floods in central Europe and Belgium.

In 2002/2003 the economic situation in Europe deteriorated. There were redundancies, bankruptcies and shutdowns. Although some national events, such as the Prestige oil spill, ETA terrorist attacks in Spain, conflicts within Haider's Freedom Party in Austria, the Pim Fortuyn murder in the Netherlands, and the Israel–Palestine conflict, did make it to the front pages of foreign newspapers, their impact on those other countries was minimal. Other national events had greater cross-national echoes, but not necessarily with the same meaning or impact. In almost half of the countries, elections took place. There were also a good few political and financial scandals, several strikes and demonstrations, and, of course, the inevitable sporting triumphs and disasters. Meanwhile, immigration was a rising issue in several countries, as was EU enlargement.

Whereas Round 1 of the ESS was accompanied by preparations for the Iraq war, Round 2 took place during the war itself. So among the events recorded were a large number of terror attacks and hostage-takings, as well as – in January 2005 – the national elections in Iraq.
Other recorded international events in Round 2 were the Darfur conflict in Sudan, the Beslan school siege in Russia, bomb attacks in Jakarta and Egypt, elections in Afghanistan, and – most prominent of all, perhaps – the tsunami disaster in Asia on 26 December. The economic situation in much of Europe had not improved substantially since Round 1, and the issue of immigration continued to rumble on.

During fieldwork in the final months of 2004, there was a seriously contested election in Ukraine, one of the ESS countries, and the Theo van Gogh murder in the Netherlands, followed by an outbreak of anti-Islam incidents and anti-terror raids. In November 2004 the US presidential election resulted in a second term for George W. Bush, and Yasser Arafat died; early in December the PISA report on education was published to widespread press attention in many countries. Meanwhile Pope John Paul II had become terminally ill (he died in April 2005).
In 2004, 10 new countries entered the EU, of which six (the Czech Republic, Estonia, Hungary, Poland, Slovakia and Slovenia) were already participating in ESS Round 2. All but Estonia and Slovakia had also participated in Round 1, enabling changes to be monitored over time. Other major EU events during Round 2 were the start in December of accession talks with Turkey (also an ESS country) and the rejection of the proposed new EU constitution by referendums in two other ESS countries – France and the Netherlands – leading to the abandonment of planned referendums in other countries.

The overview above is simply an impressionistic view of some major media stories that broke during and around the fieldwork of ESS Round 2, and which may have had a one-off impact on expressed public values. The event database is more crowded and provides more detail. A proper academic analysis of media-reported events in the period would, however, have required a special coding operation for which we did not have a budget. It would have recorded more precise details of the timing of each event, the differential exposure of particular events by country, and the coding of specific events within general classes. Thus, the Iraq hostage-takings, the van Gogh murder and the Beslan school siege would become discrete events in the coding rather than being classed simply as 'acts of terrorism'. Similarly, the elections in Ukraine would not be classified simply as an election but as a specific and more prolonged event.

Stathopoulou (2004) used a sophisticated combination of linguistic processing of textual data, correspondence analysis and cluster analysis on the ESS Round 1 data to produce a set of event clusters per month, some general and some specific. Her results make it possible to distinguish groups of events that relate to several countries from those that relate to a single country, and to follow events over time, identifying similarities and differences between countries.
Figure 5.1 presents the results of a much simpler correspondence analysis based on word counts of the event reports in October 2003. Events that were infrequently mentioned or related only to a single country (such as Haider's success in Austria or the Prestige oil spill in Spain) have been removed. The data for this correspondence analysis are the number of times a particular word is mentioned in a particular country. Thus, Finland, Norway, Sweden, Flanders, Portugal and the Netherlands are all in the upper right quadrant of Figure 5.1, because they were all characterised by a greater than average number of stories involving their economies, including business and financial scandals. In contrast, the Czech Republic, Austria, Slovenia and Hungary are all in the lower right quadrant because they were more than averagely involved – as was Denmark – with stories about EU accession and enlargement.

Figure 5.1 Countries and events, October 2003, weighted by word count

Meanwhile, Switzerland was at that time somewhat preoccupied with demonstrations and strikes, mainly in its milk processing industry. Strikes and demonstrations were also rife in Israel and Italy at the time. For instance, the event report for October 2003 from Italy started as follows:

Tens of thousands of Italian workers have been taking part in rallies as part of a general strike to protest against labour reforms and budget cuts by the government of Prime Minister Silvio Berlusconi. The strike (there were demonstrations in 120 towns all over Italy), called by Italy's largest and most left-wing trade union, CGIL, caused chaos in the transport sector with air, rail and local transport severely affected.
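A correspondence analysis of this kind can be reproduced in a few lines. The sketch below, with entirely hypothetical word counts, follows the textbook SVD formulation of simple correspondence analysis; it is not the actual computation behind Figure 5.1.

```python
import numpy as np

# Hypothetical country-by-word count matrix (rows: countries, columns: words).
countries = ["FI", "NL", "CZ", "CH"]
counts = np.array([
    [20, 18,  3,  2,  1],   # economy-heavy reporting
    [17, 15,  4,  1,  2],
    [ 3,  2, 16,  5,  1],   # EU accession-heavy reporting
    [ 2,  1,  3, 14, 12],   # strikes/demonstrations-heavy reporting
], dtype=float)

def correspondence_analysis(N, k=2):
    """Simple correspondence analysis via SVD of standardised residuals."""
    P = N / N.sum()                      # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * s) / np.sqrt(r)[:, None]  # principal row coordinates
    return row_coords[:, :k], s**2              # coordinates, principal inertias

coords, inertias = correspondence_analysis(counts)
# Countries with similar word profiles receive nearby coordinates;
# the total inertia equals Pearson's chi-square statistic divided by n.
```

Plotting the first two coordinate columns yields a map like Figure 5.1, in which countries cluster by the kinds of stories that dominated their event reports.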
Similarly, Israel's record covering no more than a single week in October 2003 contained the following three reports:

October 13: some 100,000 workers in municipalities, local authorities and religious councils launched an open-ended strike.
October 16: the strike spread to sea ports. Due to the strike, most government offices were closed all week, while most local services continued to be halted by municipal workers.
October 19: the Histadrut decided to suspend the strike for a while due to the security situation.

So, despite less than standardised event-reporting from different countries, our subsequent coding and analysis seemed to capture the gist of the major events that might have affected answers to the ESS questionnaire at the time. And this was precisely the aim of the event database – not merely to provide a view of what happened in Europe during the period of ESS fieldwork, but also to track the possible impact of events on attitudes and opinions.

On the other hand, our first examination of attitudes over the period of ESS Round 1 itself produced no clear picture of the short-term impact of particular events on attitudes. There was, for instance, no clear impact on trust in politicians or institutions (not even the UN) that seemed to stem from the considerable political turmoil at the time, mainly over the Iraq war. The fact is that measuring the impact of events on attitudes is a highly uncertain and complex affair. For instance, despite the political turmoil in the Netherlands as a result of the Theo van Gogh murder, responses to a question on freedom of speech in a regular Netherlands-based survey on cultural change (Verhagen, 2006) showed an abrupt change in the immediate aftermath, but rapidly returned to the original level. As Bradburn (1969) has recommended, event reports need to be supplemented by more systematic research on psychological reactions to significant events.
Although events may alter attitudes, their effect may be highly specific and short-lived. The ESS was, of course, set up to measure long-term climate changes in attitudes rather than short-term changes in public opinion. Thus, our interest in event-reporting is not so much in the short-term impact of a particular event per se, but rather in the way it might affect or distort the measurement of long-term trends.

Looking ahead

Our experience to date with event reporting in the ESS has already shown us that – useful as it undoubtedly is – it could certainly be improved upon.
Several issues remain to be solved, among them our perhaps too strong emphasis on front page stories, our unresolved difficulties in the coding of events, and our as yet less than standardised approach to the whole reporting process.

In particular, our concentration on front page reports was motivated by our wish to cover the most important and potentially most influential stories. But not only do front pages differ across newspapers; newspapers also differ across countries. Some front pages concentrate heavily on sensationalist stories ('if it bleeds, it leads'), while others focus more on stories of national importance. Either way, our focus on front pages may miss important cross-national similarities, thus exaggerating cross-cultural differences. For this and other reasons, events might in future have to be taken from articles on other pages, and possibly from editorial and 'op-ed' pages too.

In any case it has been clear to us from the outset that for a multilevel study such as the ESS, the parsimonious method of narrative event reporting that we have adopted so far can be no more than a short-term device. We need to move on to more rigorous methods, incorporating events coded to a standardised frame. An example of such methods is the Kansas Event Data System project (KEDS, www.ku.edu/~keds/index.html), which was initially designed to develop appropriate techniques for converting English-language reports of political events into standardised event data. By classifying events in the first place we might be able to move on to a system of transparent and reproducible automated coding. An overview of such automated methods is available at www.ku.edu/~keds/papers.html (see also Schrodt, 2001). Colleagues at ZUMA (Cornelia Züll and Juliane Landmann, 2003) have already begun carrying out experiments on ESS event data using automatic coding.
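A first step towards such automated coding is simple dictionary-based classification. The sketch below is a deliberately crude illustration: real systems such as KEDS use full pattern grammars, and both the keyword lists and category labels here are our own invention.

```python
# Hypothetical keyword dictionary keyed by event category.
CATEGORIES = {
    "election/referendum":    ["election", "referendum", "ballot"],
    "acts of terrorism":      ["terror", "bomb", "hostage"],
    "strikes/demonstrations": ["strike", "demonstration", "rally", "riot"],
    "disasters":              ["flood", "tsunami", "storm", "earthquake"],
}

def code_event(report):
    """Assign zero or more categories to a narrative event report."""
    text = report.lower()
    matched = [cat for cat, keywords in CATEGORIES.items()
               if any(kw in text for kw in keywords)]
    return matched or ["uncoded"]

code_event("Tens of thousands join a general strike in Rome")
# -> ["strikes/demonstrations"]
```

Keyword matching of this kind is transparent and reproducible, but it illustrates exactly the difficulty discussed above: the share of reports left "uncoded" rises quickly once the vocabulary of the reports drifts away from the dictionary.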
Our present event material in the ESS, based as it is on reports from NCs, may be too dependent on subjective choices to be a good base for such coding. The ZUMA researchers have therefore started to use the original newspaper articles as an alternative source. This poses the additional problem of language: the ESS event reports are in English, whereas the newspapers used are in more than 25 languages.

In a collaborative effort between the Social and Cultural Planning Office in the Netherlands, City University, London, the University of Bristol, and ZUMA in Germany, plans are being made to develop a more standardised, impartial, comprehensive and accessible tool for event reporting. This work is being funded by the European Commission as part of its infrastructure support of the ESS. The new tool will also use newspapers as the primary source of event reports, but the electronic version from the Lexis-Nexis database rather than the paper version. Lexis-Nexis
provides an on-line search mechanism for newspapers in a wide variety of countries, but – where no such coverage exists – we can buy and store the relevant newspapers (intact or on microfiche) to be quarried at a later date. Human coders will code events post hoc, which allows the coding for different countries to proceed at different paces. The possible implementation of automatic coding will have to be studied.

By selecting a wide range of newsprint media, we should ideally be able to retrieve the salient events that have occurred in different types of newspaper (left/right; elite/mass readership) in all countries and to develop a systematic coding frame of event variables (time of event, place, actor, geographical scope, etc.). The resulting coded event database would ideally also provide the means by which researchers could derive electronically not only a description of the events, but also their relative salience in different countries. The database could also be interrogated in relation to specific time periods and/or issue fields, making it possible to find clues as to whether any special national or cross-national factors have influenced the responses in particular rounds of the ESS. For the moment we estimate that around four newspapers per country would suffice for this purpose – covering left-leaning and right-leaning elite newspapers and the two most 'popular' newspapers, drawing samples from each of their main news sections.

It is, of course, possible that these plans may prove to be over-ambitious. It is too early to tell. So far, however, they appear to be feasible and well worth pursuing. And to the extent that we succeed, these sorts of media-reported events could develop into an important resource not just for the ESS, but for cross-cultural time series in general.

Notes

1. The Inter-university Consortium for Political and Social Research (ICPSR, www.icpsr.umich.edu/org/index.html) holds several studies on political events, often within a particular period and within a particular area. An interesting example is the World Handbook of Political and Social Indicators series (Taylor and Jodice, 1986), which contains a large number of daily, quarterly and annual political events all over the world, from Afghanistan to Zimbabwe, and in addition aggregate political, economic and social data and rates of change for 155 countries. Events covered include demonstrations, riots, strikes, assassinations, elections, referendums and the imposition of political restrictions, including censorship, and in particular periods also bombing, ambush, raid, arrest, release of the arrested, imposition of martial law or curfew, and relaxation of martial law or curfew.
References

Bradburn, N.M. (1969), The Structure of Psychological Well-Being, Chicago: Aldine Publishing. Available on-line from NORC Library: cloud9.norc.uchicago.edu/dlib/spwb/index.htm
Cohen, B. (1963), The Press and Foreign Policy, Princeton: Princeton University Press.
Das, E.H.H.J., Bushman, B.J. and Bezemer, M. (2005), 'The impact of terrorist acts on Dutch society: The case of the Van Gogh murder'. Presentation at the First EASR Conference, Barcelona, July 2005.
ESF (European Science Foundation) (1999), Blueprint for a European Social Survey, Strasbourg: ESF.
Maney, G.M. and Oliver, P.E. (2001), 'Finding Collective Events: Sources, Searches, Timing'. Sociological Methods & Research, 30 (2), pp.131–169.
Nas, M.A.J.C. (2000), Sustainable Environment, Unsustained Attention: A Study of Attitudes, the Media and the Environment, The Hague: SCP. (Full text in Dutch; abstract and summary available in English from: www.scp.nl/english/publications/summaries/9057495244.html)
Pfetsch, B. (2004), The Voice of the Media in the European Public Sphere: Comparative Analysis of Newspaper Editorials. Available on-line from: http://europub.wz-berlin.de/project%20reports.en.htm
Schrodt, P.A. (2001), 'Automated Coding of International Event Data Using Sparse Parsing Techniques'. Paper presented at the International Studies Association, Chicago, February 2001.
Stathopoulou, T. (2004), 'Modelling Events for the European Social Survey: Towards the Creation of an Autonomous Tool for Survey Research'. Paper presented at the Sixth International Conference on Social Science Methodology, Amsterdam, The Netherlands, August 2004.
Taylor, C.L. and Jodice, D.A. (1986), World Handbook of Political and Social Indicators III: 1948–1982 [Computer file]. Compiled by C.L. Taylor, Virginia Polytechnic Institute and State University. 2nd ICPSR edition, Ann Arbor, MI: University of Michigan, Inter-university Consortium for Political and Social Research (producer and distributor).
Verhagen, J. (2006), 'Robuuste meningen. Het effect van responsverhogende strategieën bij het onderzoek Culturele Veranderingen in Nederland' ['Robust opinions: the effect of response-enhancing strategies in the Cultural Changes in the Netherlands survey'], The Hague: SCP.
Züll, C. and Landmann, J. (2003), European Social Survey and Event Data, Working Paper, Mannheim, Germany: ZUMA.
Jowell-Chapter-06.qxd
6
3/9/2007
6:44 PM
Page 113
Understanding and improving response rates
Jaak Billiet, Achim Koch and Michel Philippens*
Introduction

The ESS was from the outset designed to be a high-quality research instrument for the social sciences. One way in which the quality of a survey is often measured is its overall response rate – not an unreasonable premise, since the higher the proportion of its target respondents who participate, the more reliable its results are likely to be. Although this somewhat oversimplifies the issue, the headline co-operation rate and the associated issue of non-response bias nonetheless remain central to survey quality. So if the ESS was to place emphasis on its methodological quality, response rates were inevitably a key variable, though, of course, by no means the only variable.1

This chapter addresses the issue of survey participation and its effect on the quality of survey findings. There will, of course, always be some designated respondents in a survey who cannot be located by the interviewers during the fieldwork period (non-contacts). There are others who are contacted but then decline to participate (refusals). And there are still others who simply cannot participate because of, say, illness or language problems (unable to answer). In cross-national surveys in particular, non-response can threaten the validity of comparisons between nations.

* Jaak Billiet is Professor of Social Methodology at the Katholieke Universiteit Leuven, Centre for Sociological Research. Michel Philippens was formerly a research assistant at the same institute; Achim Koch is a Senior Researcher at the European Centre for Comparative Surveys (ECCS) at ZUMA, Germany.
1 All methodological quality measures are documented on the ESS website (http://www.europeansocialsurvey.com/) and on the ESS data archive website (http://ess.nsd.uib.no/).
In a review of the literature on non-response in cross-national surveys, Couper and De Leeuw (2003, p.157) comment: "Only if we know how data quality is affected by non-response in each country or culture can we assess and improve the comparability of international and cross-cultural data." The most important question in this context is whether non-response leads to bias in the resulting survey estimates. This will be the case when respondents and non-respondents differ systematically with respect to the survey variables, in which case both the generalisability of the survey results to the target population and the comparability of results across countries may be put at risk.

Despite their obvious importance, non-response issues are often ignored in cross-national surveys. For some reason, the strict standards that are applied to the evaluation of national surveys are often suspended when it comes to cross-national studies (Jowell, 1998).

We describe in this chapter the measures we have introduced in the ESS both to reduce non-response and to derive information about non-response. Our focus here is on non-contacts and refusals. By discovering the particular factors affecting non-contacts and refusals in different ESS countries, we hope to find ways of improving response rates in future rounds of the survey and, we hope, in similar studies. Our data come primarily from an analysis of Round 1 ESS contact forms, in which interviewers in all countries are required to record the mode, time and outcome of all contact attempts they make. In addition, we make use of aggregate-level data for each country from the National Technical Summaries (see chapter 7), which all countries provide when delivering data to the ESS archive. We use a pragmatic approach to data quality assessment in which process and outcome variables are treated as equally important (Loosveldt et al., 2004).
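The link between non-response and bias raised above can be made concrete with the standard textbook decomposition (our illustration, not a formula from this chapter): the bias of the respondent mean is approximately the non-response rate multiplied by the difference between respondent and non-respondent means.

```python
def nonresponse_bias(resp_mean, nonresp_mean, nonresponse_rate):
    """Deterministic approximation: bias(y_r) = (m/n) * (Y_r - Y_m),
    where m/n is the non-response rate and Y_r, Y_m are the means
    among respondents and non-respondents respectively."""
    return nonresponse_rate * (resp_mean - nonresp_mean)

# Hypothetical numbers: 30% non-response; respondents average 6.2 on a
# 0-10 trust scale, while non-respondents would have averaged 5.0.
bias = nonresponse_bias(6.2, 5.0, 0.30)   # estimate is ~0.36 points too high
```

The decomposition makes the two levers explicit: bias shrinks either when the non-response rate falls or when respondents and non-respondents become more alike, which is why a higher response rate alone does not guarantee lower bias.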
Thus our evaluation of data quality deals not only with each step in the process of data collection (i.e. the contact attempts of interviewers), but also, of course, with the overall outcomes of the survey (response rates, comparability of estimates with known distributions in the population, and so on). Bringing these two approaches together allows us to develop an analytical tool that will become more powerful at each new round of the ESS, allowing us to draw practical conclusions about how to improve data quality.

So in this chapter we will look first at the standards and documentation required by the ESS to maximise its response rates, to reduce bias and to provide for the analysis of these phenomena at a macro and micro level. We will examine ESS Round 1 fieldwork and assess how it performed from a quality perspective, looking in detail at both response and non-response. We will try to identify some of the reasons for differences between countries on these measures, with an emphasis on factors that can be influenced by the research design and its implementation. In particular, we will examine the effectiveness of efforts to reduce non-contacts and convert refusals. We will then reflect on the extent to which non-response creates bias in the substantive
findings of the ESS. For convenience, we refer in all cases to Round 1 of the ESS, but the same basic story can be told of Round 2.

Response quality: standards and documentation

When setting up the ESS we developed, in co-operation with other experts, a set of methodological standards that had to be pursued in each participating country. The quality standards we set were based not on the lowest common denominator across all countries, but oriented towards those countries normally associated with the highest research quality (Lynn, 2003). In relation to response rate enhancement, the standards and specifications were as follows:

• Data had to be collected by face-to-face interview
• Interviewers had to make at least four visits (with at least one evening and one weekend visit) before a case could be abandoned as non-productive
• Unless a country had an individual-named sampling frame with accompanying telephone numbers, all visits – including the first contact – had to be made in person
• Substitution of difficult-to-reach or reluctant target persons was not permitted under any circumstances
• All interviewers had to be personally briefed on the survey prior to fieldwork
• The workload of any single interviewer was limited to a maximum of 48 issued sampling units
• Fieldwork was to be closely monitored, including the production of fortnightly reports on response
• The fieldwork period had to cover at least 30 days.

In addition to these basic standards, challenging targets were set. With respect to response rates, countries were asked to aim for (and budget for) a response rate of at least 70 per cent. Although we realised that this response rate would be very challenging for some countries (to say the least), we thought it appropriate to aim as high as possible, both to raise the lowest response rates and not to depress the highest ones.
To help countries reach this target response rate they were encouraged to implement a set of Current Best Practice guidelines, which included:

• Selecting the most experienced interviewers whenever possible
• Boosting interviewers' confidence about their abilities
• Briefing all interviewers in personal training sessions lasting at least half a day
• Training interviewers in doorstep introductions and persuasion skills
• Considering the use of incentives for respondents
• Reissuing all 'soft' refusals and as many 'hard' refusals as possible.

It was, however, clear that simply setting standards and targets would not be enough (Park and Jowell, 1997). We also had to introduce careful monitoring, evaluation and feedback. If the ESS aimed to improve standards more generally, then it needed to document and report on the extent to which standards were met. By feeding back information on compliance with or deviations from standards into the survey process, actions can be taken to improve procedures and standards round by round. Thus the Central Coordinating Team (CCT) carefully documents non-response and requires the National Technical Summaries to include:

• Length of fieldwork period
• Payment and briefing of interviewers
• Number of visits required (including the number required to be in the evenings or at weekends)
• The use of quality-control back-checks
• The use of special refusal conversion strategies2
• The use of advance letters, brochures and respondent incentives
• The distribution of outcome codes for the total issued sample, according to a pre-defined set of categories.

Indeed, we went even further in documenting non-response. We have standardised information on non-response not only at the aggregate level for each country, but also at the level of each individual sample unit. As noted, every country had to use contact forms to record detailed fieldwork information at each visit. Developing such uniform contact forms in the context of a cross-national survey was a rather complex task. We first needed to make an inventory of contact forms used by several European survey organisations, and we then had to develop separate versions of the contact forms for each class of sampling frame and selection procedure used in the ESS.
We had to strike a delicate balance between the burden this process would place on the interviewers who had to record the data and the necessity of having detailed contact records available for subsequent analysis (Devacht et al., 2003; Stoop et al., 2003). In the end we were able to produce a standardised contact form specification and a resultant standard data file comprising information on:

2 We use the term 'refusal conversion' because it is widely used in the methodological literature. This does not refer to a 'flat' or 'final' refusal. It would perhaps be more appropriate to refer to 'repeated attempts to persuade initially reluctant persons to reconsider their participation in the survey'.
• Date and time of each visit
• Mode of each visit (face-to-face vs. telephone)3
• Respondent selection procedure in the household
• Outcome of each visit (realised interview, non-contact, refusal, etc.)
• Reason for refusal, plus gender and estimated age of target person
• Neighbourhood characteristics of each sample unit
• Interviewer identification.
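Call records of this kind make the ESS visit rules directly checkable during fieldwork monitoring. The sketch below tests the four-visit rule described earlier; the cut-offs for "evening" and "weekend" are our assumptions, not ESS definitions.

```python
from datetime import datetime

def meets_calling_rules(attempt_times):
    """True if a case received at least four visits, including at least
    one evening visit and one weekend visit, before being abandoned."""
    evening = any(t.hour >= 18 for t in attempt_times)      # assumed: 18:00 onwards
    weekend = any(t.weekday() >= 5 for t in attempt_times)  # Saturday or Sunday
    return len(attempt_times) >= 4 and evening and weekend

# Hypothetical call record for one sample unit.
visits = [datetime(2002, 10, 7, 10, 30), datetime(2002, 10, 8, 19, 0),
          datetime(2002, 10, 12, 11, 0), datetime(2002, 10, 14, 15, 0)]
meets_calling_rules(visits)   # -> True (an evening visit on 8 Oct, a Saturday visit on 12 Oct)
```

Running such a check over all abandoned cases gives a simple per-country compliance rate for the calling rules, which is exactly the kind of process indicator the contact-form data were designed to support.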
Most countries (17 out of 22)4 successfully delivered a complete call record dataset. No comparable information was made available from the other five countries5 for a number of reasons. In some cases, the survey agencies were not familiar with the collection of call record data and found the burden too heavy; in others, restrictive confidentiality laws prevented the release of information about refusals, non-contacts or even neighbourhood characteristics (Devacht et al., 2003; Stoop et al., 2003).

The conduct of fieldwork

To meet the demanding methodological standards of the ESS, only survey organisations capable of carrying out probability-based surveys to the highest standard of rigour were to be appointed. This was clearly easier to achieve in some countries than in others, depending on the prevalence of high-quality survey practices. As it turned out, the majority of countries in Round 1 (12 out of 22) selected a commercial survey agency, four selected a university institute, three a non-profit survey organisation, and the remaining three their national statistical institute (see appendix for details).

As noted, the prescribed method of data collection was face-to-face interviewing, with countries free to choose between traditional paper-and-pencil interviewing (PAPI) and computer-assisted interviewing (CAPI). In Round 1, 12 countries used PAPI and 10 used CAPI.

The fieldwork specification for Round 1 required a fieldwork period of at least one month within the four months between September and December 2002. By trying to make national fieldwork periods broadly coincide, we hoped to reduce the impact of external events on the findings (see chapter 5).

3 Under certain specified circumstances, some contacts were permitted to be by telephone.
4 Austria (AT), Belgium (BE), Germany (DE), Finland (FI), Great Britain (GB), Greece (GR), Hungary (HU), Ireland (IE), Israel (IL), Italy (IT), Luxembourg (LU), Poland (PL), Portugal (PT), Spain (ES), Switzerland (CH), The Netherlands (NL) and Slovenia (SI).
5 Czech Republic (CZ), Norway (NO), Sweden (SE), Denmark (DK) and France (FR).

On average, fieldwork took 124 days, but there were large differences between countries. The shortest fieldwork period was 28 days in Hungary, the longest 240 days in Austria. Only five countries managed to complete their fieldwork in 2002, and another seven before the end of February 2003. The last country finished fieldwork in December 2003, though it had also started very late. Problems in obtaining the necessary funding were the main reason for the delays. In any event, our pursuit of simultaneous fieldwork periods in all countries proved to be less than successful, to say the least.

Fourteen of the 22 countries achieved their specified sample size requirement of at least 2000 achieved interviews (or 1000 interviews in countries with a population of less than two million – see chapter 2). There were a number of reasons why the other eight countries did not meet their target, among them smaller budgets than necessary and, in some cases, lower response rates than anticipated.

In the remainder of this chapter we focus on the analysis of our unique dataset (at least in a cross-national context) of response and non-response data at an individual and aggregate level. We focus in particular on the information we gathered about response rates, refusals and non-contacts.

Response and non-response

The call record data collected in the ESS offer the advantage that the same non-response outcome definitions and non-response rate formulae may be used across countries, thereby enabling valid cross-national non-response comparisons. As noted, not all countries delivered a dataset containing the necessary information. So, for countries with no suitable call record data, we report response and non-response rates calculated from the information provided in the National Technical Summaries, recognising that these may not be directly comparable and need to be treated with due caution.
Before turning to the response rates themselves, we must describe the definitions and formulae used to calculate them. We needed first to construct an overall non-response disposition for each sample unit, since the call record dataset did not contain a variable that expressed this in its final form. Instead, we had to combine the separate outcomes of the individual calls into a single final code. This could be done either by taking the outcome of the last contact (with any member of the household) as the final non-response code (see AAPOR, 2004), or by setting up a priority order and then selecting the outcome with the highest priority (see, for instance, Lynn et al., 2001). Thus a refusal code that comes early in a sequence of visits may be given priority over a non-contact code at a subsequent or final visit.
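The two coding rules just described can be combined into a single procedure. The sketch below is a minimal illustration under our own assumptions, not the actual ESS coding script; the outcome labels and function name are invented for the example.

```python
# Minimal sketch of a final-disposition rule: take the outcome of the last
# contact attempt, but give an earlier refusal priority over later
# non-interview outcomes, and let a completed interview (i.e. a successful
# conversion) override everything. Outcome labels are illustrative.

def final_disposition(call_outcomes):
    """Collapse the outcomes of successive calls into one final code."""
    if "interview" in call_outcomes:
        return "interview"        # a successful conversion outranks all else
    if "refusal" in call_outcomes:
        return "refusal"          # an early refusal beats later non-contacts
    return call_outcomes[-1]      # otherwise, the last call decides

print(final_disposition(["non_contact", "refusal", "non_contact"]))  # refusal
print(final_disposition(["refusal", "interview"]))                   # interview
print(final_disposition(["non_contact", "non_contact"]))             # non_contact
```

Note how the refusal in the first example overrides the non-contact recorded at the final visit, exactly the priority decision discussed above.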
Understanding and improving response rates
We chose to use a combination of these two approaches. Thus, we took the outcome of the last contact as the final non-response code, except when a refusal occurred at an earlier visit and subsequent contacts with the household resulted in other eligible non-response outcomes. In these cases, we took the final non-response code to be “refusal to participate” (Philippens and Billiet, 2004). When a non-response code was followed by a response because of a successful conversion attempt, the final outcome became a response code, since response had a higher priority in the coding procedure. With respect to the definition of outcomes, we classify as ‘refusals’ any unwillingness to participate, whether expressed by a proxy, by a household member on behalf of the whole household, or by the target respondent. Similarly, people were classified as refusals if they broke their appointments, were at home but did not answer the door, or broke off the interview part-way through. Non-contacts, on the other hand, are defined as those addresses or households at which no contact with
Table 6.1  Response, refusal and non-contact rates

Country  Response  Non-contact  Refusal  Eligible      Total
         rate %    rate %       rate %   sample size   sample size
GR       79.6      1.7          16.9     3222          3227
FI       73.3      1.4          20.9     2728          2766
PL       72.2      0.8          19.6     2921          2978
SI       71.8      2.4          15.3     2114          2175
IL       70.9      3.0          21.3     3523          3600
HU       70.3      3.2          15.1     2398          2484
SE       69.0      4.0          21.0     2878          3000
PT       68.8      3.2          26.9     2196          2366
DK       68.4      4.6          23.0     2143          2150
NL       67.8      2.5          26.2     3486          3570
NO       65.0      3.0          25.0     3109          3215
IE       64.4      8.1          22.9     3179          3185
AT       60.6      10.1         27.0     3725          3828
BE       59.3      4.5          25.6     3204          3340
GB       55.0      3.5          30.6     3730          4013
DE       53.7      5.9          29.3     5436          5796
ES       53.6      7.9          35.3     3227          3657
IT       43.4      2.8          45.8     2778          3000
LU       43.2      6.9          37.0     3589          3773
CZ(1)    43.0      –            –        –             –
CH(2)    33.0      2.0          55.1     4652          5086

Notes: (1) No detailed information is available for the Czech Republic. (2) For Switzerland, two approaches were followed: the first involved face-to-face recruitment and the second telephone recruitment. Here we report only on the telephone part of the survey, since the contact form data for the face-to-face part were not suitable for analysis.
Source: Contact forms data file
anyone was made at any visit. But respondents who moved within the country and were not re-approached were not treated as non-contacts, so as to enhance comparability between household and individual-named samples on the one hand, and address samples on the other.

The response rates, refusal rates and non-contact rates are shown in Table 6.1. All figures are expressed as percentages of the total eligible sample size. In effect, the eligible sample comprises all selected addresses or households that were residential and occupied by residents aged 15 and over. The figures in Table 6.1 show that about half of the participating countries obtained response rates close to or higher than the specified target rate of 70 per cent. But they also reveal rather large differences between countries. Some countries (Greece, Finland, Poland, Slovenia, Israel and Hungary) achieved response rates higher than 70 per cent, while others (Italy, Luxembourg, the Czech Republic and Switzerland) obtained response rates lower than 50 per cent. These large non-response differences could raise questions about the validity of cross-national comparisons. But the distribution of non-response appears to be rather similar across countries, and refusals are comfortably the most common reason for non-participation.

Our aim of keeping non-contact rates in all countries to a strict minimum (target 3 per cent or less) seems to have been achieved in most cases. In 16 of the countries non-contact rates in Round 1 were lower than 5 per cent. The exceptions were Austria (10 per cent), Ireland (8 per cent), Spain (8 per cent), Luxembourg (7 per cent) and Germany (6 per cent). In Round 2 and beyond, action was taken to try to increase contact rates, and therefore overall response rates, by increasing the number of call attempts and introducing greater variety in call patterns.
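All three rates in Table 6.1 share the same denominator, the eligible sample. A brief sketch of the arithmetic, with invented counts (the function and field names are ours, not an ESS standard):

```python
# Illustrative only: compute response, refusal and non-contact rates as
# percentages of the eligible sample (total sample minus ineligible units,
# e.g. non-residential addresses). All counts below are invented.

def outcome_rates(total_sample, ineligible, interviews, refusals, non_contacts):
    eligible = total_sample - ineligible
    pct = lambda n: round(100.0 * n / eligible, 1)
    return {"response": pct(interviews),
            "refusal": pct(refusals),
            "non_contact": pct(non_contacts)}

print(outcome_rates(total_sample=2050, ineligible=50,
                    interviews=1360, refusals=420, non_contacts=60))
# → {'response': 68.0, 'refusal': 21.0, 'non_contact': 3.0}
```

Because all three rates use the eligible sample as denominator, they can be compared directly across the columns of Table 6.1.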
One remarkable observation from the Round 1 (and Round 2) data is that the well-documented problem of non-response in the Netherlands is not replicated in the ESS (see, for instance, De Heer, 1999; Hox and De Leeuw, 2002; Stoop and Philippens, 2004). The Netherlands response rate was in fact close to the specified target rate of 70 per cent, and in the next section we show how this result was achieved.

Why such large country differences in response rates?

Many factors may be responsible for the observed differences in response rates. On the one hand, societal factors that cannot, of course, be influenced by the research design may play a part (Groves and Couper, 1998). There are, for instance, differences in the ‘survey climate’ across countries. Not only do survey practices vary by country, but so do public attitudes to surveys and the extent
to which people consider them useful or legitimate. These survey-climate factors may influence overall co-operation and refusal rates. But apart from these considerations, there is also variation in ‘at-home’ patterns across countries. These patterns influence the contactability of households and thus the effort needed to bring down non-contact rates. Given the large demographic variations between countries (with respect to birth rates, the proportion of women in employment, and so on), ‘at-home’ patterns are likely to vary rather strongly across countries (see De Leeuw and De Heer, 2002).

Although survey climate and at-home patterns are interesting and important from a theoretical point of view, they have limited practical importance since they cannot be manipulated by the research design (other than, for instance, by appropriate planning of fieldwork around the prevailing ‘at-home’ patterns). More pertinent in this context are factors that are, at least in principle, under the control of the researcher. According to De Heer (1999, pp.136–137) these can be divided into three categories:

• General design factors: mode of data collection, survey method (panel vs. cross-sectional), observational unit (household vs. individual).
• Fieldwork efforts: number of contact attempts, refusal conversion efforts, interviewer and respondent incentives, and interviewer training.
• Survey organisation: conditions of employment of interviewers, supervision of interviewers.

Our analysis focuses primarily on differences in the second category (fieldwork efforts). We first discuss the number and timing of contact attempts and possible explanations for differences in non-contact rates, and then move on to a comparison and evaluation of refusal conversion attempts.

Country differences in non-contact rate reduction

In order to minimise fieldwork variation between countries, the ESS specifies a common calling strategy in all countries.
Interviewers are required to make at least four personal visits to each sampling unit before abandoning it as non-productive, including at least one attempt in the evening and at least one at the weekend. Moreover, these attempts have to be spread over at least two different weeks. But even where these instructions were scrupulously followed, there were often significant differences in contactability. Many countries decided to exceed these minimum requirements and were encouraged to do so. For instance, Irish, Slovenian and Greek interviewers were required to make at least
five contact attempts, while Polish and Slovenian interviewers had to make at least two evening calls.
Contact procedures

The first contact with potential respondents, often following an advance letter, was required to be in person. Only after initial contact had been made could interviewers make appointments by telephone for a face-to-face interview. As noted, however, the restriction on making initial contact by a personal visit was relaxed for countries with registers of named individuals that included telephone numbers. Analysis of the call record data shows that only Switzerland, Sweden, Finland and Norway used mainly telephone attempts to recruit respondents. Everywhere else, almost all contact attempts were made face to face.
Number of contact attempts

It is generally assumed that increasing the number of contact attempts is the most effective strategy for decreasing non-contact rates. Figure 6.1 plots the average number of call attempts made to non-contacts against the percentage of non-contacts in the eligible sample. As we would expect, there is a negative relationship between the achieved non-contact rate and the average number of contact attempts (Spearman’s rho = -0.41). Figure 6.1 indicates that countries such as Germany, Belgium, Ireland, Luxembourg and Austria, which made on average fewer than the prescribed number of contact attempts to non-contacts, did not achieve the target non-contact rate of three per cent.

A detailed analysis of all call records reveals that in Ireland, Germany and Belgium most interviewers did make the “minimum of four contact attempts”, and that it was a small number of interviewers who did not make the prescribed number of calls and recorded high non-contact rates. In Ireland and Germany, for instance, five per cent of the interviewers were responsible for approximately 50 per cent of all non-contacts who did not receive four contact attempts, while in Belgium, five per cent of interviewers were responsible for 67 per cent of all non-contacts who received fewer than four contact attempts. In these countries, the contact rate could almost certainly have been raised by closer monitoring of interviewers and by reissuing assignments to other interviewers. In Luxembourg, on the other hand, a large majority of interviewers routinely broke the “minimum of four attempts” rule for at least some of their potential respondents, suggesting a more structural problem. It is likely that interviewers there were not fully aware that the prescribed guidelines were mandatory.
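The interviewer-level concentration reported here (a few interviewers accounting for half or more of the under-worked non-contacts) can be computed from call records roughly as follows. The data, IDs and the handling of the five per cent cut-off are our own illustration, not the ESS procedure.

```python
from collections import Counter

# Illustrative sketch: what share of the non-contacts that received fewer
# than four attempts is attributable to the top 5 per cent of interviewers?
# The interviewer IDs below are invented.

def top_share(case_interviewer_ids, top_fraction=0.05):
    counts = Counter(case_interviewer_ids).most_common()
    n_top = max(1, round(len(counts) * top_fraction))   # at least one interviewer
    return sum(n for _, n in counts[:n_top]) / len(case_interviewer_ids)

# 20 interviewers; one of them alone produced 50 of the 107 problem cases
cases = ["int_01"] * 50 + [f"int_{k:02d}" for k in range(2, 21) for _ in range(3)]
print(round(top_share(cases), 2))  # → 0.47
```

A value near 0.5 for the top five per cent of interviewers would flag exactly the kind of concentration found in Ireland and Germany.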
Countries that complied with the “minimum of four call attempts” rule generally reached the target non-contact rate of three per cent, or at least came close to it. Even so, the relationship between the average number of contact attempts and the achieved non-contact rates is not clear-cut. In the UK and Spain, for instance, although more contact attempts were made than were strictly required, the target rate remained out of reach. And in Israel, although on average only 2.8 contact attempts were made before cases were abandoned, a non-contact rate of three per cent was nonetheless achieved. Differences in the timing of calls might play a mediating role in the relationship between the number of call attempts and the achieved non-contact rate. Countries with large hard-to-reach populations will tend to have to make more contact attempts to obtain the same contact rate. Similarly, in countries where interviewers routinely make more evening calls, fewer contact attempts will be needed to achieve the same non-contact rates (Purdon et al., 1999).
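The Spearman coefficient quoted above is a rank correlation. A self-contained sketch of the computation, using invented country-level values rather than the actual ESS figures:

```python
# Spearman's rho from scratch (ties ignored for simplicity). The attempt
# and non-contact figures below are invented, purely to illustrate the
# negative association discussed in the text.

def spearman_rho(x, y):
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

avg_attempts = [2.8, 3.4, 3.9, 4.5, 5.2, 8.9]      # per country, invented
non_contact_rate = [3.0, 4.6, 8.1, 2.5, 1.7, 3.5]  # per country, invented
print(round(spearman_rho(avg_attempts, non_contact_rate), 2))  # → -0.31
```

Because rho uses ranks rather than raw values, an outlier such as a country making nearly nine attempts on average does not dominate the coefficient.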
Figure 6.1 Scatterplot of average number of contact attempts versus achieved non-contact rate
Contactability

Following Groves and Couper (1998, p.80) we define “contactability” as the propensity for a household to be contacted by an interviewer at any given point in time. Because ‘at-home’ time tends to differ between countries, we should expect some populations to be harder to contact than others. To verify this, we examined the probability of contacting a household at the first call attempt at different times of the day and on different days of the week. As Figure 6.2 shows, the probability of contacting a household during a weekday morning or afternoon is relatively low in Spain, the UK, the Netherlands, Switzerland and Portugal but relatively high in Poland, Israel and Italy. So it does appear that in certain countries interviewers have to make more contact attempts to reach their sample units than in others. Indeed, these figures might partially explain why Israeli interviewers on average had to make only 2.8 contact attempts to reach their target non-contact rate, while British interviewers on average had to make close to nine attempts to achieve a similar rate.

Figure 6.2 Probability of contact at first call attempt by timing of first call attempt
Figure 6.2 also illustrates that, in line with previous research, evening and weekend contact attempts are in general more productive than weekday morning or afternoon attempts. In all countries, except for Italy and Poland, we found a significant relationship (p<.05) between the probability of contact at first contact attempt and the timing of this attempt. Certainly in the UK, Ireland, Belgium, Portugal and Switzerland evening
attempts are much more productive than morning or afternoon attempts. On the other hand, in Poland, Israel, Greece and Italy we found hardly any relationship between the timing of contact attempts and the probability of making a successful contact. It seems that, in general, the (relative) benefits of evening and weekend calls are highest in countries where households are rather hard to reach on a weekday morning or afternoon. This suggests that survey organisations in these countries can compensate for less favourable at-home patterns by shifting calling strategies towards more contact attempts in the evening and at weekends. The likelihood of successful contact thus varies with the timing of the contact attempt (Weeks et al., 1980).

Figure 6.3 shows the percentage of contact attempts that were made on a weekday morning or afternoon for each of the first three contact attempts. The percentages at each visit in this figure are based on those households where no contact was made at the previous contact attempt. The figure shows that interviewers in the UK, Ireland, Spain and Italy seem to avoid evening and weekend visits at the first two contact attempts. Indeed, in the UK 84 per cent of all first contact attempts were made on a weekday morning or afternoon; in Ireland the proportion was 75 per cent, in Spain 71 per cent, and in Italy 68 per cent. Yet, as noted, with the exception of Italy these are the very countries where the benefits of evening and weekend visits are greatest. This implies that British, Irish and Spanish interviewers might improve the efficiency of their work by transferring more of their calls to evenings and weekends.

Figure 6.3 Percentage of the first three contact attempts made on a weekday morning/afternoon
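The quantity plotted in Figure 6.2 is a simple conditional proportion: among first calls made in a given time slot, the share that resulted in contact. A sketch with invented first-call records (the slot labels are ours):

```python
# Illustrative sketch: probability of contact at the first call attempt,
# broken down by the timing of that attempt. Records are invented pairs
# of (time slot, whether anyone was contacted).

first_calls = [
    ("weekday_day", True),  ("weekday_day", False), ("weekday_day", False),
    ("weekday_day", False), ("evening", True),      ("evening", True),
    ("evening", False),     ("weekend", True),      ("weekend", False),
    ("weekend", True),
]

def contact_rate_by_slot(calls):
    totals, hits = {}, {}
    for slot, contacted in calls:
        totals[slot] = totals.get(slot, 0) + 1
        hits[slot] = hits.get(slot, 0) + int(contacted)
    return {slot: round(hits[slot] / totals[slot], 2) for slot in totals}

print(contact_rate_by_slot(first_calls))
# → {'weekday_day': 0.25, 'evening': 0.67, 'weekend': 0.67}
```

In this toy dataset, as in most of the countries discussed above, evening and weekend attempts are markedly more productive than weekday daytime attempts.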
The lowest percentages of morning and afternoon first attempts were found in Israel (32 per cent of all first call attempts) and Portugal (46 per cent). For Portugal this seems the most appropriate strategy, given our finding above that Portuguese households are rather hard to reach in the morning and afternoon. Survey organisations there appear to have compensated successfully by making more evening and weekend attempts.

Country differences in refusal conversion

Survey researchers use many techniques to increase survey participation. One of these is refusal conversion, which involves re-approaching initially reluctant respondents and persuading them to reconsider participating in the survey. Much of the success of refusal conversion procedures depends on the “softness” of the initial refusal. Refusals often occur because of the circumstances of the visit: a person may be busy when the interviewer arrives, or may be feeling unwell or irritable, and therefore refuses. In many cases, a later visit at a better time may generate a more positive response. In any event, it appears that consistent, die-hard refusers probably make up a rather small part of the total group of refusers.

We study the effectiveness of refusal conversion procedures in the ESS by examining the variety of approaches taken and assessing their respective impact on response rates. The ESS specification urges countries to reissue all “soft” refusals, and indeed many apparently “hard” refusals too, as a means of increasing participation. The suggestion is that a senior interviewer should ideally undertake such work. Given the vagueness of this specification, and the cultural specificity of the task, we anticipated quite a bit of variation in practice between countries. In any case, differences between countries in their pre-conversion response rates might mean that efforts vary.
Indeed, the specification allows countries to forgo refusal conversion where the minimum target response rate (70 per cent) has already been achieved.6 We realise, of course, that refusal conversion is both tricky and labour-intensive. Whichever practice is implemented, the procedure creates practical problems, and some survey organisations do not seem to have the means or the capacity to organise effective refusal conversion routinely (Loosveldt et al., 2003). Figure 6.4 shows the percentages of eligible sample units that explicitly refused to participate at least once, and divides them into three categories: those who were not re-approached; those who were re-approached but not converted; and those who were re-approached and converted. In line with our expectations, we see that refusal conversion efforts do vary significantly by country.
6 However, as we shall report later in this chapter, information about converted respondents may in fact be of major importance in assessing non-response bias. This applies equally to countries with high initial response rates.
Figure 6.4 Percentage of eligible sample units that refused at least once, broken down into those not re-approached, those re-approached but not converted and those successfully converted
Thus, in the Netherlands, the UK and Switzerland respectively, 88 per cent, 84 per cent and 77 per cent of all refusals were re-approached. Major conversion efforts were also made in Greece, Finland and Italy, with respectively 54 per cent, 50 per cent and 44 per cent of all refusals being re-approached. But the majority of countries, including Spain, Slovenia, Poland, Belgium, Austria and Israel, made only moderate efforts at conversion, each re-approaching between 20 and 34 per cent of all refusals. Bringing up the rear, Ireland, Hungary and Luxembourg made barely any refusal conversion efforts, while for Germany and Portugal the extent of the effort is not clear from their contact form datasets.

The success rate of conversions was highest in Austria (47 per cent), followed by the Netherlands (39 per cent), Belgium (33 per cent) and Slovenia (32 per cent). For Greece, Finland, Poland and Israel, conversion success rates lie between 20 and 30 per cent, while in the UK and Spain conversion rates were only around 15 per cent. The lowest conversion success rates were achieved in Switzerland and Italy, each around five per cent.
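The two rates being compared here are easy to conflate, since they have different denominators. A sketch with invented counts (loosely echoing the Dutch figures) keeps them apart:

```python
# Illustrative only: the re-approach rate is taken over all refusals, the
# conversion success rate over re-approached refusals only, and the gain
# in the response rate over the whole eligible sample. Counts are invented.

def conversion_summary(eligible, refusals, reapproached, converted):
    return {
        "reapproach_rate": round(100.0 * reapproached / refusals, 1),
        "success_rate": round(100.0 * converted / reapproached, 1),
        "response_rate_gain_pts": round(100.0 * converted / eligible, 1),
    }

print(conversion_summary(eligible=2000, refusals=800,
                         reapproached=704, converted=275))
# → {'reapproach_rate': 88.0, 'success_rate': 39.1, 'response_rate_gain_pts': 13.8}
```

Note how a country can combine a very high re-approach rate with a modest success rate, or vice versa; this is exactly why the cross-country comparison below needs care.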
Table 6.2  Distribution of co-operative and reluctant respondents (horizontal percentages)

Country(1)  % Co-operative  % Converted       % Converted       Total (100%)
                            ‘soft refusal’    ‘hard refusal’
NL          79.6            10.7              9.7               2361
DE          83.0            10.5              6.6               2918
CH          92.6            3.5               3.9               2004
GB          94.3            3.6               2.1               2052
AT          94.9            4.7               0.4               2244
DK          95.2            4.9               0.0               1468
SI          96.4            3.2               0.4               1518
PT          96.6            2.4               1.0               1511
FI          96.9            3.0               0.2               1999
ES          97.2            2.2               0.6               1727
BE          97.6            2.2               0.2               1895
GR          97.7            2.0               0.2               2564
PL          98.4            1.1               0.5               2109
IL          98.5            1.2               0.3               2493
IT          98.8            1.2               0.0               1205
IE          99.7            0.0               0.3               2020
LU          99.7            0.3               0.0               1547
HU          99.8            0.2               0.0               1684
CZ          100.0           0.0               0.0               1352
Total       95.1            3.4               1.6               34667

Notes: (1) No information about types of refusal in Sweden, France and Norway.
Source: Contact forms data file
It is risky, however, to read too much into these country differences in conversion success rates, because they may be attributable to different strategies. For instance, some countries (e.g. Belgium) focused their efforts on a small and probably “soft” group of refusals, while others (e.g. the Netherlands) re-approached almost all refusals, thus making high conversion rates more difficult to obtain. Even so, the result in the Netherlands is quite remarkable in comparison with other countries that reissued the majority of their refusals, such as Switzerland and the UK: the Dutch success rate was 39 per cent, compared to 15 per cent in the UK and only five per cent in Switzerland. Inspection of the National Technical Summaries reveals that the Dutch survey organisation implemented a range of special refusal conversion strategies, which almost certainly had an influence. For instance, half way through fieldwork they sent letters to refusers ‘encouraging’ reluctant sample members with the offer of incentives of up to €5 and inviting them to participate in a quiz with monetary prizes. These measures (plus special interviewer training) might well explain their success.
In general, however, the impact of refusal conversion strategies on overall response rates is rather small – usually around one to three percentage points. Indeed, in some countries, such as Switzerland and the UK, both of which invested significant efforts in refusal conversion, the returns were very disappointing. In contrast, the impact of the innovative Dutch efforts was much more impressive, increasing the overall response rate there from 53 per cent to 68 per cent. It seems, then, that even in countries with less favourable survey climates, significant efforts tend to pay off, challenging the widely held view that response rates are necessarily in continual and irredeemable decline.

Differentiation of respondents according to readiness to co-operate

Data in the contact forms enable us to identify three distinct types of respondents (see Table 6.2): co-operative respondents, converted ‘soft refusals’ and converted ‘hard refusals’. Co-operative respondents agreed to participate at first contact, showing no trace of reluctance. Converted ‘soft refusals’ were initially reluctant but decided to co-operate after a single additional attempt. Converted ‘hard refusals’ decided to participate only after several additional attempts, sometimes after special incentives had been deployed (as in the Netherlands). Note that the figures in Table 6.2 apply only to respondents who finally co-operated, not to all selected sample units. So the proportion of co-operative respondents is by definition 100 per cent in a country (such as the Czech Republic) where no attempt was made to re-approach refusals. Similarly, converted ‘soft’ and ‘hard refusals’ are counted only among reluctant respondents who were re-approached and finally co-operated.
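One way to operationalise this three-way typology from call records is sketched below. Using the count of recorded refusals as a proxy for ‘soft’ versus ‘hard’ is our simplification for illustration, not the published coding rule.

```python
# Hypothetical sketch of the Table 6.2 typology for *responding* units:
# no prior refusal -> co-operative; one refusal before the interview ->
# converted 'soft' refusal; several refusals -> converted 'hard' refusal.

def respondent_type(call_outcomes):
    assert "interview" in call_outcomes, "typology applies to respondents only"
    refusals = call_outcomes.count("refusal")
    if refusals == 0:
        return "co-operative"
    if refusals == 1:
        return "converted soft refusal"
    return "converted hard refusal"

print(respondent_type(["contact", "interview"]))   # co-operative
print(respondent_type(["refusal", "interview"]))   # converted soft refusal
print(respondent_type(["refusal", "non_contact", "refusal", "interview"]))
# → converted hard refusal
```

Applied to every responding unit in a country and tabulated, a classification like this yields the horizontal percentages of Table 6.2.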
The highest percentage of converted refusals is observed in the Netherlands, where about 10 per cent of all respondents are converted ‘soft refusals’ and another 10 per cent converted ‘hard refusals’. An increasingly important question remains, however: how much impact do such refusal conversion efforts have on the final survey estimates in surveys such as the ESS? It is to this question that we now turn.

Estimation of non-response bias

We have tried to estimate the direction of bias related to non-response, using information on types of respondent derived from the contact forms dataset (Billiet and Philippens, 2004; Billiet et al., 2005). We classified respondents into ‘co-operative’ and ‘reluctant’ categories according to whether they agreed
to an interview at first contact or only after conversion. Having merged the data file of completed interviews with information from the contact forms, we based the analysis on the premise that the views and characteristics of reluctant respondents would more closely resemble those of the people who finally refused. There is no clear consensus in the literature as to whether this is the case. Some studies support the premise (e.g. Stoop, 2004; Voogt, 2004), while others either leave room for doubt or reject it altogether (e.g. Lin and Schaeffer, 1995; Stoop, 2005; Teitler et al., 2003). Additional research should cast further light on this debate.

The analysis was confined to the subset of countries with an adequate number of converted respondents (see Table 6.2). Because of the small cell sizes, we could not distinguish between converted ‘soft’ and ‘hard refusals’, except in the Netherlands and Germany, where the proportion of converted respondents was over 15 per cent (more than 450 cases). In another three countries (the UK, Switzerland and Austria), the proportion of converted respondents was between five and seven per cent (more than 100 observations), but everywhere else the numbers were too small to produce anything approaching reliable results.

We studied four categories of substantive variables: socio-demographic variables, social integration, political involvement and attitudes towards immigrants (Billiet and Philippens, 2004). Based on our previous research, we anticipated a relationship between these variables and non-response, because people who are less integrated into their society, participate less in politics, or are suspicious of strangers are all less likely to participate in surveys (Brehm, 1993; Groves and Couper, 1998; Voogt and Saris, 2003; Voogt, 2004; Loosveldt and Carton, 2002).
The first step in the analysis was to look at differences between co-operative and reluctant respondents on socio-demographic variables where, according to prevailing theories about non-response, some bias was anticipated. We examined respondents’ ages, their education level, whether they lived in an urban or rural environment, and their participation in voluntary associations.

• We anticipated that a greater proportion of converted respondents would have lower levels of education. This was true in the UK, Germany and the Netherlands, though the difference was significant (p < 0.05) only in Germany and the Netherlands. It was not true in Switzerland and Austria, where no differences were found.
• We anticipated proportionally more city and suburb dwellers among the converted refusals, but this was true only in Germany.
• We anticipated that the proportion of respondents participating in voluntary associations would be lower among reluctant respondents, but this was significant only in Austria.
• We anticipated a higher proportion of reluctant respondents among older cohorts, but differences in mean age were statistically significant at the 0.05 level only in Germany.

In sum, our expectations about the association between these socio-demographic variables and particular types of respondent were not met. The direction and strength of the relationships differed by country, possibly partly because of the small numbers of converted refusals, but at least to some extent because the relationship between these variables and reluctance to co-operate in surveys is not identical across nations.

The second step was to look at the mean scores of five attitudinal variables and compare the scores of co-operative and reluctant respondents. The variables were:

• trust in politics
• political participation
• interest in politics
• social trust
• perceived threat from immigrants.
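This second step amounts to a two-sample comparison of means for each scale. A hedged sketch using Welch's t-statistic, with invented scale scores (the actual ESS analysis used the tested attitudinal scales described below, not these toy numbers):

```python
import math
from statistics import mean, variance

# Illustrative sketch of comparing mean scale scores of co-operative and
# reluctant respondents via a two-sample (Welch) t-statistic. Scores invented.

def welch_t(a, b):
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

cooperative_trust = [5.2, 4.8, 6.1, 5.5, 4.9, 5.7, 6.0, 5.1]
reluctant_trust = [4.1, 3.9, 4.6, 5.0, 3.8, 4.3]
t = welch_t(cooperative_trust, reluctant_trust)
print(round(t, 2))
```

A large positive t here would indicate, as hypothesised, lower trust among reluctant respondents; in practice the degrees of freedom and p-value would be computed as well.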
These variables are all tested attitudinal scales based on sets of indicators rather than individual items (Billiet and Philippens, 2004, p.15), and all are indicators of integration in society. We anticipated significantly (p < 0.05) lower mean scores among reluctant respondents on the first four variables (political trust, participation, political interest and social trust), and a significantly higher mean score on the last, perceived threat from immigrants.

For perceived threat from immigrants we did indeed find the expected differences in mean scores, and – except in Austria – they were significant. On the other four variables, we found a more mixed picture. Even so, stable and significant differences in the expected direction were found in Germany and the Netherlands on the variables dealing with trust and participation in politics. In the UK and Switzerland the same differences were found but were not significant (probably because of the smaller number of cases).

These findings support the thesis that lower levels of social integration do hinder participation in surveys (Brehm, 1993; Groves and Couper, 1998; Voogt, 2004). They also mean that survey estimates of the perceived threat from immigrants, of trust in politics, or of levels of political participation are likely to be biased by high levels of non-response. The perceived threat from immigrants will tend to be underestimated, while levels of trust in politics and political participation will tend to be overestimated.

Substantive analysts are usually interested less in the univariate distributions of different variables than in the relationships between variables within
the context of explanatory models. For that reason, our third step was to analyse the relationship between each of the attitudinal variables (dependent variables) and the type of respondent (co-operative/reluctant) within regression models that also included the relevant socio-demographic variables. We conducted this analysis on data from the Netherlands and Germany, where the sample sizes were sufficiently large. At issue was whether the relationship between survey participation on the one hand and social attitudes (such as the perceived threat from immigrants, trust in politics and political participation) on the other remained significant after controlling for socio-demographic differences. The answer was that all the differences became weaker in Germany, and all but one disappeared in the Netherlands. So the clear-cut relationship between reluctant respondents and expected substantive outcomes that we thought was established turned out to be rather more blurred than it had initially appeared to be. Moreover, the relationship is not uniform across countries (Billiet et al., 2005). Although evidence of non-response bias is still apparent, we cannot yet tell how it works across a wider range of variables and a larger number of countries. This is a priority for future research.

Conclusion

Our work suggests that what is considered to be an "optimal" fieldwork strategy in one country might turn out to be sub-optimal in another. The achievement of an equivalent fieldwork strategy in cross-national studies will in fact have to involve considerable national variation within an agreed framework. Consider, for instance, the timing of contact attempts. In some countries (such as Portugal, Great Britain, Spain and Ireland) evening visits are clearly the most productive, so fieldwork organisations wishing to maximise the efficiency of contact attempts should clearly increase their proportion of evening visits.
For other countries, timing of visits has less effect on the probability of making contact with potential respondents. In these countries, the disadvantages of boosting the proportion of evening visits – such as reducing the length of the working day and increasing costs – are likely to outweigh the advantages. In any case, as Purdon et al. (1999) point out, calling strategies also have to be sensitive to the wishes and concerns of interviewers. As noted, the results of monitoring the survey process should inform national decisions, leading to actions that are likely to improve fieldwork procedures. The non-response analysis we have undertaken so far has already yielded some helpful starting points for national action, such as the appropriate use of refusal conversion procedures. These procedures were especially successful in the
Netherlands, where response rates were raised from 53 to 68 per cent. They included special training for refusal conversion, as well as incentives for both respondents and interviewers (Stoop, 2004, 2005). Other countries, especially those with only moderate or low response rates, have much to learn from success stories like these. This has already happened in Round 2, when Switzerland, the Czech Republic and Luxembourg significantly increased their response rates.

We have also learned that by relating information from contact forms about types of respondent to substantive data about respondent attitudes and characteristics, we can provide helpful insights into the extent and nature of non-response bias. But such analyses can be undertaken only if the number of converted refusals is large enough. Our analyses have confirmed that differences between co-operative and reluctant respondents do tend to bias the distributions of answers on variables such as trust in politics, participation and perceived threat from immigrants. These effects were all in the expected direction, suggesting that survey participation is related to the social participation of the respondents. But this effect weakens considerably when socio-demographic variables are controlled for. Detailed, multivariate analysis across a wider range of ESS countries is needed in future rounds to produce a clear picture of how non-response has an impact on bias. Once this analysis has taken place, we will be in a better position to use information from the contact forms (such as individual and neighbourhood characteristics) of each selected sample unit to analyse and adjust for bias that stems from non-response.

The analyses we have presented in this chapter are merely a first step in improving survey procedures in the ESS and other similar projects. They cannot yet provide definitive answers.
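The attenuation described above – a group difference that shrinks once socio-demographic variables are controlled for – can be illustrated with a small regression sketch. All data and variable names below are invented for illustration; this does not reproduce the chapter's own models.

```python
# Hedged sketch: regress an attitude score on a 'reluctant respondent' dummy,
# with and without a socio-demographic control (here, age). Invented data.

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination. X is a list of row lists."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    A = [row + [b] for row, b in zip(xtx, xty)]     # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# (intercept, reluctant dummy, age, political-trust score) -- invented rows
# in which reluctance and age are deliberately correlated
rows = [
    (1, 0, 30, 6.0), (1, 0, 35, 5.5), (1, 0, 40, 5.0), (1, 0, 50, 4.5),
    (1, 1, 55, 4.0), (1, 1, 60, 3.5), (1, 1, 65, 3.5), (1, 1, 70, 3.0),
]
y = [r[3] for r in rows]

b_naive = ols([[r[0], r[1]] for r in rows], y)       # trust ~ reluctant
b_ctrl = ols([[r[0], r[1], r[2]] for r in rows], y)  # trust ~ reluctant + age

print("difference without controls:", round(b_naive[1], 2))
print("difference with age controlled:", round(b_ctrl[1], 2))
```

Because age and reluctance are correlated in the invented data, the coefficient on the reluctance dummy shrinks sharply once age enters the model – the same pattern the Dutch and German analyses showed.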
Improving national fieldwork strategies requires us to focus on national data and existing national procedures, where local specialists will add useful insights. This is likely to be a continuing iterative process that develops with each round and should be aided considerably by the opportunities for in-depth discussions offered by the ESS's new infrastructure grant.

References

Billiet, J. and Philippens, M. (2004), 'Data Quality Assessment in ESS Round 1: Between Wishes and Reality', Paper presented at the Sixth International Conference on Social Science Methodology: Recent Developments and Applications in Social Research Methodology, RC33 Logic and Methodology, 17–20 August 2004, Amsterdam.
Billiet, J., Philippens, M., Fitzgerald, R. and Stoop, I. (2005), 'Estimation of Response Bias in ESS Round 1: Using Information from Reluctant Respondents', Paper presented at the 60th Annual AAPOR Conference, 12–15 May 2005, Miami.

Brehm, J. (1993), The Phantom Respondents: Opinion Surveys and Political Representation, Ann Arbor: University of Michigan Press.

Couper, M. and De Leeuw, E. (2003), 'Non-response in Cross-Cultural and Cross-National Surveys' in: J.A. Harkness, F.J.R. Van de Vijver and P. Mohler (eds), Cross-Cultural Survey Methods, New Jersey: John Wiley, pp.157–178.

De Heer, W. (1999), 'International Response Trends: Results of an International Survey', Journal of Official Statistics, 15 (2), pp.129–142.

De Leeuw, E. and De Heer, W. (2002), 'Trends in Household Survey Non-response: A Longitudinal and International Comparison' in: R.M. Groves, D.A. Dillman, J.L. Eltinge and R.J.A. Little (eds), Survey Non-response, New York: John Wiley, pp.41–54.

Devacht, S., Loosveldt, G., Philippens, M. and Billiet, J. (2003), Procesevaluatie van de dataverzameling in de eerste ronde van het European Social Survey, Onderzoeksverslag van het Departement Sociologie, Afdeling Dataverzameling en analyse, DA/2003-33.

Groves, R. and Couper, M. (1998), Non-response in Household Interview Surveys, New York: John Wiley.

Hox, J. and De Leeuw, E. (2002), 'The Influence of Interviewers' Attitude and Behavior on Household Survey Non-response: An International Comparison' in: R.M. Groves, D.A. Dillman, J.L. Eltinge and R.J.A. Little (eds), Survey Non-response, New York: Wiley, pp.103–120.

Jowell, R. (1998), 'How Comparative is Comparative Research?', American Behavioral Scientist, 42 (2), pp.168–177.

Lin, F. and Schaeffer, N.C. (1995), 'Using Survey Participation to Estimate the Impact of Non-participation', Public Opinion Quarterly, 59, pp.236–258.

Loosveldt, G. and Carton, A.
(2002), 'Utilitarian Individualism and Panel Nonresponse', International Journal of Public Opinion Research, 14, pp.428–438.

Loosveldt, G., Carton, A. and Billiet, J. (2004), 'Assessment of Survey Data Quality: A Pragmatic Approach Focused on Interviewer Tasks', International Journal of Market Research, 46, pp.65–82.

Loosveldt, G., Philippens, M., Stoop, I. and Billiet, J. (2003), 'Reluctance and Refusal Conversion Procedures in the ESS', Paper presented at the International Workshop on Household Survey Non-response, 22–24 September 2003, Leuven, Belgium.

Lynn, P. (2003), 'Developing Quality Standards for Cross-National Survey Research: Five Approaches', International Journal of Social Research Methodology, 6 (4), pp.323–336.

Lynn, P., Beerten, R., Laiho, J. and Martin, J. (2001), Recommended Standard Final Outcome Categories and Standard Definitions of Response Rate for Social Surveys, ISER Working Paper Number 2001–23.
Park, A. and Jowell, R. (1997), Consistencies and Differences in a Cross-National Survey: The International Social Survey Programme, Cologne: ZA.

Philippens, M. and Billiet, J. (2004), 'Monitoring and Evaluating Non-response Issues and Fieldwork Efforts in the European Social Survey', Paper presented at the European Conference on Quality and Methodology in Official Statistics, 19 May 2004, Mainz.

Purdon, S., Campanelli, P. and Sturgis, P. (1999), 'Interviewers' Calling Strategies on Face-to-Face Interview Surveys', Journal of Official Statistics, 15 (2), pp.199–216.

Stoop, I. (2004), 'Surveying Nonrespondents', Field Methods, 16 (1), pp.23–54.

Stoop, I. (2005), The Hunt for the Last Respondent, The Hague: Social and Cultural Planning Office.

Stoop, I., Devacht, S., Loosveldt, G., Billiet, J. and Philippens, M. (2003), 'Developing a Uniform Contact Description Form', Paper presented at the 14th International Workshop on Household Survey Non-response, 22–24 September 2003, Leuven, Belgium.

Stoop, I. and Philippens, M. (2004), 'Non-respons in Nederland: van zwart schaap naar witte raaf', in: Hollandse taferelen, Den Haag: SCP, pp.41–46.

Teitler, J.O., Reichman, N.E. and Sprachman, S. (2003), 'Costs and Benefits of Improving Response Rates for a Hard-to-Reach Population', Public Opinion Quarterly, 26, pp.126–138.

The American Association for Public Opinion Research (AAPOR) (2004), Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, 3rd edition, Lenexa, KS: AAPOR.

Voogt, R. (2004), '"I am not Interested": Non-response Bias, Response Bias and Stimulus Effects in Election Research', PhD Thesis, University of Amsterdam.

Voogt, R. and Saris, W. (2003), 'Political Interest: The Key to Correction Methods for Non-response Bias in Election Studies', Paper presented at the 14th International Workshop on Household Survey Non-response, 22–24 September 2003, Leuven, Belgium.
Appendix: Fieldwork in ESS Round 1

Country | Survey organisation | Type of survey organisation1 | Main questionnaire mode | Fieldwork period | Number of realised interviews

Austria | IPR – Sozialforschung | C | PAPI | 02/02/2003–30/09/2003 | 2257
Belgium | ISPO, Department of Sociology, Leuven; CLEO – Univ. of Liege | U | PAPI | 01/10/2002–30/04/2003 | 1899
Czech Republic | STEM, s.r.o. | C | PAPI | 24/11/2002–09/03/2003 | 1360
Denmark | SFI-Survey | C | CAPI | 28/10/2002–19/06/2003 | 1506
Finland | Statistics Finland | S | CAPI | 09/09/2002–10/12/2002 | 2000
France | Institut de Sondage Lavialle | C | CAPI | 15/09/2003–15/12/2003 | 1503
Germany | Infas Institut für angewandte Sozialwissenschaft GmbH | C | CAPI | 20/11/2002–16/05/2003 | 2919
Greece | MRB Hellas; OPINION SA | C | PAPI | 29/01/2003–15/03/2003 | 2566
Hungary | TÁRKI Social Research Centre | C | PAPI | 29/10/2002–26/11/2002 | 1685
Ireland | Economic and Social Research Institute | N | PAPI | 11/12/2002–12/04/2003 | 2046
Israel | B.I. Cohen Institute for Public Opinion Research, Tel-Aviv University | U | PAPI | 15/10/2002–15/01/2003 | 2499
Italy | TNS Abacus | C | CAPI | 13/01/2003–30/06/2003 | 1207
Luxembourg | CEPS/INSTEAD | N | PAPI | 14/04/2003–14/08/2003 | 1552
Netherlands | Gfk Panel Services | C | CAPI | 01/09/2002–24/02/2003 | 2364
Norway | Statistics Norway – Division for sample surveys | S | CAPI | 16/09/2002–17/01/2003 | 2036
Poland | Center for Social Survey Research, Institute of Philosophy and Sociology, Polish Academy of Sciences | U | PAPI | 30/09/2002–19/12/2002 | 2110
Portugal | TNS Euroteste | C | PAPI | 26/09/2002–20/01/2003 | 1511
Slovenia | University of Ljubljana, Faculty of Social Sciences, Public Opinion and Mass Communication Research Center | U | PAPI | 17/10/2002–30/11/2002 | 1519
Spain | DEMOSCOPIA | C | PAPI | 19/11/2002–20/02/2003 | 1729
Sweden | Statistics Sweden, SCB | S | CAPI | 23/09/2002–20/12/2002 | 1999
Switzerland | M.I.S. Trend SA | C | CAPI | 09/09/2002–08/02/2003 | 2040
United Kingdom | National Centre for Social Research (NatCen); Central Survey Unit of the Northern Ireland Statistics and Research Agency | N | CAPI | 24/09/2002–04/02/2003 | 2052

Notes: 1 C = Commercial survey agency; N = Non-profit survey organisation; S = Statistical agency; U = University institute
7
Free and immediate access to data
Kirstine Kolsrud, Knut Kalgraff Skjåk and Bjørn Henrichsen*

* Kirstine Kolsrud is a Senior Adviser, Knut Kalgraff Skjåk is Head of Department and Bjørn Henrichsen is Director at Norwegian Social Science Data Services.

Introduction

As quantitative social science expands, a serious attempt is at last being made to reduce the barriers between the producers and users of large social science investments. Many large-scale multi-country surveys are now attempting to reduce the gap, and this has been a defining feature of the ESS. Historically, however, comparative social science research has been hampered by fragmentation. Data, documentation, metadata, publications and knowledge have rarely been collated in a central hub, and little has been done to overcome the difficulties of language, culture and institutional barriers. Now, with the growing maturity of social science, the scientific principle of allowing others to replicate the methods of any given project is not only becoming more possible but is to be encouraged and facilitated. Cutting-edge data access arrangements are central to allowing that process to take place.

Moreover, since large-scale multi-country surveys are, in effect, public goods, often funded from the tax base at considerable cost, they need to be models of transparency. Their data must be quickly available and subject to scrutiny. This means overcoming longstanding barriers of propriety and technology that have always frustrated both users and funders. We will show in this chapter that these old rules and habits are at last changing. Following changing practice in the USA, where principal investigators of large research investments are increasingly required to
give up prior rights of access to the data, the ESS provides access to the whole social science community simultaneously. So, for instance, neither the principal investigators nor the question module design teams in the ESS have any guarantee of being able to 'publish first'. On the contrary, 'ownership' of the data is effectively distributed between all users.

Indeed, as we shall show, ESS practice goes even further. All data and supporting documentation are released immediately in an easy-to-use form directly via the worldwide web, allowing analysts to begin serious work on the data without any of the usual technical and administrative delays. Thus, potential data users are immediately equipped with the easy-to-use tools they require to produce high-quality analyses.

Other chapters in this book highlight the numerous challenges to measurement equivalence faced by a cross-national survey like the ESS. So we have to make sure that data users have sufficient information about all stages of the research process to assess whether the differences they find between nations are real or artefactual. Ever since the ESS was conceived, the principle of free and immediate access to the data for all has been an integral part of the design (ESF, 1999).

As the official data archive and data service institution for the ESS, the Norwegian Social Science Data Services (NSD) works in close collaboration with the rest of the ESS's Central Coordinating Team (CCT). Its particular aim is to enhance communication, data sharing and collaboration between spatially dispersed but scientifically related communities through the use of cutting-edge information technologies. Its ultimate aim is to empower data users by providing them with the tools they need to analyse and interpret an inherently complicated cross-national time series.
We identify below what we consider to be the most common barriers to accessing data and documentation, and show how the ESS has managed to overcome many of these longstanding obstacles, setting new standards of data access. ESS data and documentation have, of course, already been distributed across the world. And we shall show how a combination of modern technology and meticulous documentation has enabled the demands of many disparate users to be met. We provide a guided tour of the processes and technology that have made such advances possible, many of which are now being adopted by similar projects elsewhere.

Data access barriers

As shown in Table 7.1, comparative surveys face three main types of barrier to free and comprehensive access to data, each of which faced the ESS in its planning stages.
Table 7.1  Barriers to free and comprehensive access to data

Legal and institutional: privacy/confidentiality; ownership rights/embargoes; pricing systems
Organisation of data and documentation: lack of standardisation of variables; poor quality and lack of standardisation of meta data; lack of transparency; out-of-date information systems
Cultural: customs/habits; attitudes
Legal hindrances, stemming from privacy concerns, are often cited as the primary obstacles to gaining access to individual-level micro data. In practice, however, legal barriers all too often become blurred with cultural barriers, which combine to produce apparently impenetrable obstacles to free access. For instance, data producers often make their micro data available only in what are referred to as 'safe settings', instead of distributing them to users to quarry in their own time (Dale and Trivellato, 2002). Although there may on occasions be a case for such restrictions – such as when individuals could too easily be identified in the data – in most general population surveys with anonymised datasets this is surely overkill. Or, as Dale and Trivellato say, "variability in these release policies are only moderately associated with the micro data sets involved. What it mainly reflects is differences in attitudes" (p.8). And when dealing with anonymised data sets, strict data dissemination rules "can hardly be justified only on confidentiality grounds" (p.13).

Distribution practices are also to some extent nation- or institution-specific. Some principal investigators, presumably with the support of their funders, claim the right to deny others access to the data, at least within a reasonable time period or at a reasonable cost. This is even true of some projects supported from the public purse, where it might have been assumed that the data would be considered a public good. Even statistical offices sometimes require considerable fees from 'outsiders' who wish to use 'their' data, as if the data were a commodity produced for their purposes alone. The notion of intellectual ownership of, or copyright to, data is also often an institutionally or culturally defined concept. Sometimes data are embargoed until after an article on the findings has passed through the review process of a scientific journal.
Our view is that if large-scale, publicly funded research projects are to justify their funding, they must do so on the basis of free and equal access to the data for all. In the absence of this, the analytical power of the dataset is potentially diminished and, in the long run, support for such exercises is also bound to diminish. So from the start – legal and institutional barriers notwithstanding – ESS data have always been considered and treated as a public good. Thus the data are made simultaneously available to all, regardless of their prior involvement in the
exercise, their affiliation or their nation. This principle of simultaneous and equal access applies as much to the Central Coordinating Team (CCT), the question module design teams and the National Coordinators as to the casual user. In other words, we recognise above all that the intellectual content of the ESS is a collective product, drawing implicitly or explicitly on the past and present work of many other scholars in Europe and beyond.

Naturally, we are aided by the fact that all our data are in an anonymised form. Indeed, in accordance with data protection regulations in each participating country, only anonymised data are deposited at the archive. Thus it is the data depositor's obligation to ensure that their own national data conform to their country's privacy laws. This means that analysis opportunities are occasionally limited, such as for scholars who wish to link respondents' records with detailed local area information. But the advantages of this approach seem to us to far outweigh the disadvantages.

Overcoming the legal, institutional and cultural barriers to data access is, however, only part of the story. An equally important challenge has been to overcome the barriers of often poor organisation and standardisation of data. This can take various guises, such as different variable names or data formats for the same variables, poor quality, lack of standardisation or even the complete absence of meta data, poor integration between data and meta data, and the poor upkeep or lack of transparency of supporting information systems. The ESS was specifically designed to overcome these barriers in an attempt not only to enhance the comparability of variables but also to enable the adequate integration of high-quality, well-harmonised data and meta data. Since all aspects of the ESS are driven by a determination to increase the transparency of both process and outcomes, these principles are reflected in the final dataset and accompanying documentation.
Standardising the production of data and meta data

As noted, Round 1 of the ESS involved 22 participating countries, each of which was required to implement the survey in an equivalent way. Significant thought therefore went not only into the front end of the process, involving standardised design and data collection, but also into how the subsequent data would be organised, standardised and made available to the end user. We were determined from the start that the dataset itself and its accompanying documentation should both be made available in an unusually accessible and helpful form.
The data

As a result of its central funding and organisational structure, the ESS was perhaps more able than its predecessor multinational projects to insist on a set
of directly equivalent processes and measures. But in any case, the days when data users would first have to consult reams of documentation and re-code variables before they could start their analyses are doubtless a thing of the past. A combination of input harmonisation and adherence to internationally accredited standards must become the new norm (Kolsrud and Skjåk, 2004). As in all other areas of the ESS, the desire to achieve equivalence has had to be translated into clear data specifications and documentation standards – a topic we return to later.

Admittedly, advances in technology have been a key factor in enabling the ESS to achieve its ambitious aims – as evidenced in the ESS's publicly accessible data website. But of similar importance has been the creation by NSD of an ESS 'working' website for the prior sole use of national teams and the CCT (the Archive website). This website includes all services necessary to plan and produce the standardised and harmonised cross-national data files that find their way into the publicly accessible data website. It also provides all country teams with the definitions and other tools they need to deposit their data in a standard and orderly way (see Figure 7.1).

Figure 7.1  The ESS 'working' website
The Archive website occupies centre stage in the processing of ESS data and documentation. Each national team downloads the standardised specifications and processing tools (production aids) and subsequently deposits its data files and documentation (products) to the site. After NSD has processed and quality-controlled these files, each national team then downloads the now harmonised national data files, together with the programs that the Archive has used as a means of validation.
The ESS Data Protocol document (see Figure 7.2), available from the Archive website, is pivotal in the achievement of cross-national uniformity in data delivery. It gives specifications for the coding of data as well as the production and delivery of data files and other electronic deliverables. Its largest section contains the detailed coding plan, which defines the rules for the definition of different categories of answer – whether numeric or alphanumeric codes are employed – and the appropriate use of hitherto non-defined codes such as 'not applicable'. The coding plan also includes detailed routing instructions consistent with the instructions given in the source questionnaire, which in Round 2 was supplemented by a flowchart.

Figure 7.2  The ESS Data Protocol document
The ESS variable names, labels and codes incorporated within the coding plan are also copied into programs and "empty" data files (dictionaries) in SAS and SPSS. These programs and data files can thus be downloaded directly from the website and applied by the national teams in building CAPI programs, data entry programs and national data files (see Figure 7.3).
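The kind of validation such dictionaries enable can be sketched as follows. The variable names, codes and checking logic below are hypothetical illustrations, not the actual ESS coding plan or the Archive's validation programs.

```python
# Hedged sketch: checking a deposited file against a coding-plan dictionary.
# Variable names and code sets are invented for illustration.

CODING_PLAN = {
    "gndr": {1, 2, 9},              # 9 = no answer (hypothetical convention)
    "polintr": {1, 2, 3, 4, 7, 8, 9},
}

def validate(records, plan):
    """Return a list of (row, variable, problem) deviations from the plan."""
    problems = []
    for i, rec in enumerate(records):
        for var, value in rec.items():
            if var not in plan:
                problems.append((i, var, "unknown variable"))
            elif value not in plan[var]:
                problems.append((i, var, value))
    return problems

deposited = [
    {"gndr": 1, "polintr": 2},
    {"gndr": 3, "polintr": 2},   # 3 is not a valid code for this variable
    {"gndr": 2, "polint": 1},    # misspelled variable name
]
for problem in validate(deposited, CODING_PLAN):
    print(problem)
```

In practice such checks would run over every variable in the coding plan, flag out-of-range codes and naming deviations, and feed the results back to the national team before harmonisation.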
Figure 7.3  SPSS data dictionary
The Data Protocol document also gives an overview of the files and documents that national teams are required to deliver to the archive. As noted, they are asked to pay particular attention to the anonymisation of their data in advance of delivery. The Protocol also deals with "principles of variable definitions", providing detailed information on the use of missing values, how multiple responses are dealt with, and the rationale behind variable names, labels and categories. Country-specific variables (for example interview time and region) are also listed, explained and specified, as is the application of international standards and coding frames, which can be downloaded in different formats. These acknowledged international standards for the coding of verbatim variables are compiled for use in each round of the ESS and are treated as the only valid standards for that round. They are later included in the study documentation for secondary data users. Examples are:

• Occupation: ISCO88 (COM), four-digit version.
• Industry: NACE Rev.1, two-digit version.
• Country codes (citizenship, country of birth, etc.): ISO 3166-1, two-character version.
• Language: ISO 639-2.

Meanwhile, the ESS has developed its own coding frames for education and religion – the education standard being a slightly modified version of ISCED-97, and the religion frame covering the seven largest religions in the world, based on country-specific coding schemes.
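Applying one of these standards to a verbatim variable might look like the following sketch. The lookup table is a tiny illustrative subset of ISO 3166-1 alpha-2 codes, and the fallback value for non-codeable answers is an assumption, not the ESS convention.

```python
# Sketch: recoding verbatim 'country of birth' answers to ISO 3166-1 alpha-2.
# The table is a small illustrative subset, not a full ISO 3166-1 listing.
ISO_3166_ALPHA2 = {
    "germany": "DE",
    "france": "FR",
    "norway": "NO",
    "united kingdom": "GB",
}

def recode_country(verbatim, table=ISO_3166_ALPHA2, missing="99"):
    # '99' as the not-codeable value is an assumption for this sketch
    return table.get(verbatim.strip().lower(), missing)

print(recode_country("Germany"))
print(recode_country(" FRANCE "))
print(recode_country("Norge"))   # native-language answer not in the table
```

The last case illustrates why verbatim coding needs country-specific lookup tables: respondents answer in their own language, so each national team's table must map local spellings onto the shared standard.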
It must be said, however, that some of these standard frames have caused difficulties. For instance, ISCO88 was not always coded to the same level of detail in all countries, and the modified ISCED frame has not been without its problems. Each country had to develop its own specific questions to capture educational qualifications, which could then be mapped into the modified ISCED frame. But this mapping sometimes required a level of knowledge about both the frame and the qualifications that was not always available within the national team. So in Round 3 we intend to address this issue not only by providing teams with more guidance on the mapping of qualifications, but also by encouraging greater detail in the country-specific questions employed to obtain them.
The survey documentation

A significant part of providing high-quality ESS data is the provision of meta data documentation – until recently a neglected area. Meta data consist in this case of the data and other text that help to define and describe the main ESS data. Whereas in the past, datasets might sometimes have been deposited without even a questionnaire accompanying them, and certainly without any detailed fieldwork documentation, the ESS places great emphasis on providing comprehensive and structured documentation that enables data users to make use of all the tools they might need to understand and use the data.

Mohler and Uher (2003) highlight the general importance of documentation for secondary analysts. But for a project as complex as the ESS, the accessibility and comprehensiveness of such documentation take on still greater importance. After all, as Clark and Schober (1992) point out, the secondary data analyst shares none of the common ground and collective memory of the primary data producers and users. Moreover, in a cross-national context, users come from a variety of cultural backgrounds and bring different levels of knowledge about the socio-cultural contexts in which the survey has been carried out. So the importance of adequate documentation becomes even more acute. The very fact that the ESS dataset is so readily accessible makes it even more important that its purpose, context and limitations are clearly documented.

To achieve these quality standards, the compilation of relevant meta data must take place in parallel with, or sometimes prior to, the actual data collection. By making the documentation requirements known to all participating bodies in advance of the study, we help to ensure that the documentation will ultimately be comprehensive and in a comparable format. As the official ESS Archive, NSD is responsible for the collection and distribution of the meta data.
It thus compiles a list of the components of and specifications for the meta data documentation in advance of ESS fieldwork. Most of the information naturally comes from national teams which provide the key
Figure 7.4  The National Technical Summary form (NTS)
information on how the survey was carried out in their countries. But this information is supplemented by information from the Central Coordinating Team and the Sampling Panel. The vehicle for much of the meta data from national teams is a standardised National Technical Summary form (see Figure 7.4). The NTS is made available via the Archive website to all national teams. It covers information about the survey process (fieldwork dates, response rates, fieldwork procedures, etc.) as well as country-specific contextual information such as the nature of the education system, the party system and the demographic composition of the population. The components of the NTS match the structure of the meta data standard adopted by the Data Documentation Initiative,1 allowing the meta data to be presented on the Internet in a standardised and structured way alongside the actual data files.

1 The Data Documentation Initiative is a continuing effort to establish a universally supported meta data standard for the social science community (see http://www.icpsr.umich.edu/DDI/index.html).
Figure 7.5 The ESS meta data flow
The National Technical Summary was designed with the twin aims of making the documentation process less arduous for the national teams and of encouraging more standardised reporting so as to bolster comparability. To make this documentation speedily and widely available in a user-friendly format, it is stored in a relational database that allows outputs in different formats. So the ESS Data Documentation Report is available as a text report in both Adobe Acrobat (pdf) and Microsoft Word formats. But there is also an XML/DTD2 file which is read by the NESSTAR3 system and integrated with the data file (see Figure 7.5).

Documentation is by no means limited to the summary information collected in the NTS. All documents and information made available to survey respondents, fieldwork agencies and interviewers are also made available to the end users. So the ESS documentation consists of more than 100 different documents, including both the source ESS survey documents and the equivalent (and additional) country-specific documents.

2 DTD is the Document Type Definition of the DDI.
3 NESSTAR is described further in the next section.
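As an illustration of what DDI-style structured documentation looks like in practice, the sketch below builds a minimal XML variable description in Python. The element names (var, labl, qstn, catgry) follow the general shape of the DDI Codebook markup but are simplified, and the variable and wording shown are illustrative, not taken from the actual ESS documentation files.

```python
# Illustrative sketch: emitting a DDI-style XML variable description.
# Element names are simplified; the real DDI tag library is much richer.
import xml.etree.ElementTree as ET

def variable_description(name, label, question, categories):
    """Build a DDI-like <var> element for one survey variable."""
    var = ET.Element("var", name=name)
    ET.SubElement(var, "labl").text = label
    ET.SubElement(var, "qstn").text = question
    for value, cat_label in categories.items():
        catgry = ET.SubElement(var, "catgry")
        ET.SubElement(catgry, "catValu").text = str(value)
        ET.SubElement(catgry, "labl").text = cat_label
    return var

codebook = ET.Element("codeBook")
codebook.append(variable_description(
    "polintr",
    "How interested in politics",
    "How interested would you say you are in politics?",
    {1: "Very interested", 2: "Quite interested",
     3: "Hardly interested", 4: "Not at all interested"},
))
print(ET.tostring(codebook, encoding="unicode"))
```

Because the documentation is machine-readable in this way, the same source can be rendered as a PDF report, a Word document, or an interactive web codebook.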
In addition to the survey documents per se, we felt the need to produce documentation about the socio-cultural context in which each national survey was carried out. This is because, for instance, attitudes to immigration are often highly related to levels of education on the one hand and volumes of migration on the other. So national teams are also asked to provide some key population statistics to contextualise the questionnaire. Primary among these are age and gender composition, educational attainment figures, and degree of urbanisation. These are supplemented in particular rounds by subject-specific statistics, such as those on immigration. Meanwhile, NSD supplements the documentation with statistics about national and regional characteristics.

Dissemination

The most common starting point for the data user, even in the recent past, was to have to negotiate a range of institutional distribution practices, such as who is eligible to access the data, when they can do so and for what purposes. These hurdles have often not been substantial – involving perhaps the completion of an electronic questionnaire on the web or placing an order for a CD-ROM of the data and waiting for it to arrive. But they have generated many complaints from data users anxious to get on with their jobs. So some data archives began distributing data over the Internet and providing the data user with an access code. But a "conditions of use" document then usually had to be completed and signed before the data became available. The time involved in gaining access to data in these ways naturally varied, but it has always been a bone of contention. The funding arrangements and the collective nature of the ESS enable it to distribute data and documentation to potential users much more swiftly, at no cost to the data user, and without the usual legal, institutional and cultural barriers.
By using a full range of web-based services for data users, we are able to provide not only quicker access to the data, but also access to the latest technology, which allows the user to run simple cross-tabulations and to read the accompanying documentation before having to decide whether or not to download the full dataset. As it turns out, just under half of the registered data users of the ESS do not in fact download the full dataset. But free and immediate access to the data via the web through a user-friendly dissemination facility has resulted in greater equality of access for all comers.

NSD distributes ESS data and documentation on behalf of the CCT by means of two interlinked systems based on web technology. The overarching system, which is visible to the data analyst, is the ESS Data Website, http://ess.nsd.uib.no (see Figure 7.6). This website serves as the reservoir for the complete set of ESS data, meta data and documentation. Integrated within this website is the distribution mechanism called NESSTAR.
Figure 7.6 Components of the ESS Data Website
The ESS Data Website tries to meet different user requirements, in particular their needs for different levels of documentation. It is divided into sections on direct data download, fieldwork documents, survey documentation, contextual data, and on-line browsing and download. Starting with the download facility, we briefly describe the role and content of the different sections.

The Direct download of data section offers access to integrated as well as country-specific files. The integrated files (SPSS and SAS), which are accessible for download, hold data from the core and rotating modules of the ESS, from the interviewer's questionnaire, which describes the interview setting, and from the test questions (used in multi-trait multi-method (MTMM) analysis of reliability and validity). The core and rotating modules are also available as separate country files. The website also offers access to data files with country-specific variables, such as variables that were not collected in an equivalent way, or extra variables that were included only in a specific country. For instance, the country-specific data file from Germany contains all 21 variables on education level that were used to bridge from the national measurement to the standardised ESS variable. The country-specific datasets also contain data from the ESS contact forms (see chapter 6) and verbatim answers about parents' occupation in local languages.
The Fieldwork documents section contains questionnaires, show cards, contact forms and fieldwork instructions in all languages (see Figure 7.7). So users can simply select a country and bring up country-specific versions of these source documents. Other documents used during fieldwork, such as advance letters and brochures, are also included. This allows data users to access the translated questionnaires and to decide for themselves whether the translation was functionally equivalent to the source questionnaire. National documents are, of course, an indispensable source for disclosing flaws that may need to be corrected in future rounds of the ESS. So all deviations in the data are documented on-line and in detail in a fieldwork summary. By making all these documents freely available on-line, not only the data user but also the wider survey community can evaluate the survey's quality and thus replicate (or avoid) certain aspects in future projects.

Figure 7.7 Access to fieldwork documents, the ESS Data Website
The Survey documentation section contains documents aimed primarily at assisting users in the analysis of the data. It includes guidelines on how to apply the weights included in the data files, documentation of the national sampling plans and design weights, and reports on question reliability and validity. Survey documentation is found in the ESS Documentation Report, which in turn contains three main sections. The first includes the overall study description, that is, information about the study itself, its key people and institutions, how to access its data, and a summary description of both the data file and the legal aspects of data use. The second contains country-specific details such as fieldwork agencies, funders, sampling and fieldwork procedures, response rates, and so on. The third contains country-by-country population statistics, documentation of classifications and standards used in the survey, plus a list of variables and questions in the main and supplementary questionnaires and variable lists sorted by question number and variable name.

The Contextual data section contains three sources of information that help to identify possible socio-cultural influences on the survey data. These consist of links to the ESS Event database (see chapter 5), an overview of sources of European context data (both compiled by SCP Netherlands), and a spreadsheet compiled by NSD containing macro data at country and regional level on topics such as elections, demographic characteristics, GDP and life expectancy.

Finally, the On-line browsing and download section (see Figure 7.8) provides access to the unique system of Networked Social Science Tools and Resources (NESSTAR – see below).

Figure 7.8 NESSTAR On-line browsing and download, Study Scope
Figure 7.9 NESSTAR variable description, interest in politics
As may be seen from Figure 7.8, the left window of the screen shows a hierarchy of folders containing various kinds of information and documentation, while the right window shows the selected item in the hierarchy. The two parts of the screen also provide details of the meta data, the keywords, the topics and the abstract. The hierarchy of the browser tree is organised around two main folders, each with sub-folders containing several elements of documentation: one holds meta data, the other variable descriptions. The system is highly dependent upon structured documentation; in fact the NESSTAR system is built on the standardised tag library Document Type Definition (DTD) developed by the Data Documentation Initiative (DDI) (Ryssevik, 1999). The meta data folder expands to a series of sub-folders containing both country-specific and survey-level information on different aspects of the survey and its procedures, while the variable description folder contains sub-folders representing the different thematic sections of the questionnaire. Once a particular variable has been selected, its documentation is displayed in the window to the right (see Figure 7.9).
As can be seen in Figure 7.9, the listing on the right includes the full question asked, together with all answer categories, plus any interviewer instructions for the question. It also displays the variable name, its position in the questionnaire, summary statistics about the variable (valid/missing cases), and any notes or warnings related to that question. An option also exists to obtain a tabular view of the distribution of the variable. So the NESSTAR system serves in effect as an electronic codebook. It makes it possible not only to display the distributions of variables, but also to select them and display them in cross-tabulations (see Figure 7.10). Another useful feature of the system is its ability to create subsets of the data based on the values of a particular variable. So a subset of certain countries or individuals can be created based on the similarity of their characteristics (e.g. full-time workers). It is also possible to carry out simple regressions, and the system offers a weighting option and the possibility of displaying tabular information graphically without having first to download the data. Whether for purposes of analysis, or simply as a means of creating a specific subset of the ESS, the data may also be downloaded in a number of different formats, such as SPSS, SAS and Stata.

NESSTAR was originally developed as a project under the European Commission's Fourth and Fifth Framework Programmes, and is in many ways based on the notion of a social science "dream machine" (Ryssevik and Musgrave, 2001), where researchers – irrespective of where they are located – may have direct on-line access to almost unlimited amounts of empirical data and meta data. So, via NESSTAR, users are able to:

• locate multiple data sources across the holdings of several data repositories
• browse detailed meta data (documentation)
• analyse and visualise the data on-line
• download appropriate subsets of data in one of a number of formats for local use.
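What such a system does behind the scenes can be illustrated with a toy example: a design-weighted cross-tabulation over a handful of respondent records, with an optional subset filter of the kind used to create a "full-time workers" subset. The countries, categories and weights below are fabricated for illustration and are not ESS data.

```python
# Sketch of a design-weighted cross-tabulation with optional subsetting.
from collections import defaultdict

# (country, employment status, design weight) - fabricated records
records = [
    ("DE", "full-time", 1.25), ("DE", "part-time", 0.75),
    ("DE", "full-time", 1.0),  ("NO", "full-time", 1.5),
    ("NO", "part-time", 0.5),  ("NO", "full-time", 1.0),
]

def weighted_crosstab(rows, subset=None):
    """Sum design weights by (country, status) cell, optionally on a subset."""
    table = defaultdict(float)
    for country, status, weight in rows:
        if subset is None or subset((country, status, weight)):
            table[(country, status)] += weight
    return dict(table)

full = weighted_crosstab(records)
# Subsetting on a variable's value, as when creating a full-time workers subset
full_timers = weighted_crosstab(records, subset=lambda r: r[1] == "full-time")
print(full[("DE", "full-time")])  # 2.25
```

A server-side facility of this kind lets the user inspect weighted tables before deciding whether the full dataset is worth downloading.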
The NESSTAR system is now in use by many other data archives in their dissemination of micro data. For instance, the Zentralarchiv für Empirische Sozialforschung (ZA) now also distributes ISSP (International Social Survey Programme) data from a NESSTAR server. Major changes in access arrangements are now afoot. Thus, NESSTAR's data search functions across different data repositories are about to be utilised
Figure 7.10 NESSTAR table of a weighted subset
in the MADIERA4 portal (Multilingual Access to Data Infrastructures of the European Research Area), which provides a common integrated interface to several existing social science archives in Europe. The ESS is already available through the MADIERA portal.

4 http://www.madiera.net/ Funded by the European Commission under the Fifth Framework Programme.

Conclusion

The nature of the ESS enterprise and its organisational framework helps to overcome many of the traditional barriers between the producers and users of large social science surveys. We believe we have reduced the barriers still further by providing a data and documentation dissemination facility that encourages widespread use of the data via web-based technology. There are, of course, almost no limits to the amount of information that modern information technology is able to make available, but its use still depends critically on its ease of access and its ability to be used and exploited
with a minimum of frustration. But technology alone is not the answer. While it can smooth the path to greater access and user-friendly usage of data, its success will always depend on the preparatory work that goes into its planned use, including unambiguous specifications and meticulous documentation. A virtual infrastructure such as the ESS requires not only sophisticated technology but also detailed planning for, and long-term servicing of, its users' demands.

With over 10,000 registered users of the ESS datasets within only three years of the first data release, there is already clear evidence that the investment and planning devoted to the ESS are paying rich dividends. The longstanding gap between data producers and data users on the one hand, and the seemingly permanent inequality between different categories of data user on the other, are, we hope, now at last much closer to being eliminated once and for all.

References

Clark, H.H. and Schober, M.F. (1992), 'Asking Questions and Influencing Answers' in: J. Tanur (ed.), Questions about Questions: Inquiries into the Cognitive Bases of Surveys, New York: Springer.

Dale, A. and Trivellato, U. (2002), 'Access to Microdata for Scientific Purposes', background paper to the 19th CEIES Seminar, 26–27 September 2002, Lisbon.

ESF (European Science Foundation) (1999), Blueprint for a European Social Survey, Strasbourg: ESF.

Kolsrud, K. and Skjåk, K.K. (2004), 'Harmonising Background Variables in International Surveys', paper to the RC33 Sixth International Conference on Social Science Methodology, 16–20 August 2004, Amsterdam.

Mohler, P. and Uher, R. (2003), 'Documenting Comparative Surveys for Secondary Analysis' in: J.A. Harkness, F.J.R. van de Vijver and P. Mohler (eds), Cross-Cultural Survey Methods, New York: Wiley.

Ryssevik, J. (1999), 'Providing Global Access to Distributed Data through Metadata Standardisation – The Parallel Stories of NESSTAR and the DDI', paper to the Conference of European Statisticians UN/ECE Work Session on Statistical Metadata, Geneva, 22–24 September 1999.

Ryssevik, J. and Musgrave, S. (2001), 'The Social Science Dream Machine: Resource Discovery, Analysis and Delivery on the Web', Social Science Computer Review, 19 (2), pp.163–174.
8

What is being learned from the ESS?

Peter Mohler∗
Introduction

In their seminal book on survey methods, Groves et al. (2004) investigated and explained the scientific and methodological foundations of social surveys such as the European Social Survey (ESS). Having devoted individual chapters to the survey errors arising from different elements of a survey, such as sampling, questionnaire design, data editing, and so on, their final chapter deals with 'total survey error'. They start the chapter with what they describe as a familiar pair of questions about sample surveys: "With all the errors … how can anything work? Can you really believe the results of any surveys?" (Groves et al., 2004, p.377). Their answer is as follows:

Each chapter (in this book) essentially describes a number of potential threats to the quality of survey statistics and what survey methodology teaches us about how to address those threats. In that context, it might be easy to overestimate the magnitudes of error in surveys. Certainly, numerous surveys are done poorly, by the standards articulated in this book, and they produce data of dubious quality. However, many surveys have incorporated excellent procedures and the results are often very impressive in terms of the quality of data that result. Probability sample surveys are routinely done that collect data from respondents who collectively look very much like the population from which they were
∗ Peter Mohler is Director of Zentrum für Umfragen, Methoden und Analysen (ZUMA), Mannheim, Germany, and Professor at Mannheim University.
drawn. … Methodological research demonstrates that properly conducted surveys can achieve very high quality results. (pp.377–378)

A few national attitude time series immediately come to mind as examples of 'properly conducted surveys', such as the US General Social Survey, the German ALLBUS, the Polish General Social Survey, and the British Social Attitudes survey (BSA), on which a number of impressive look-alikes in Australia, South Africa and elsewhere have been based. In contrast, the quality of comparative attitude surveys has always tended to be less impressive (Jowell, 1998; Harkness et al., 2003). So the ESS set out from the start to do better. To the extent that it is succeeding in addressing and rectifying the sorts of error identified by Groves et al. (2004) in a multi-national setting, it marks a milestone of sorts in comparative survey quality. From now on, it is hoped, the scientific and policy communities will no longer accept lower standards in important cross-national surveys than they would ever do in important national surveys.

On the other hand, the success of the ESS owes a great deal to a predecessor comparative study, the International Social Survey Programme (ISSP), which started in 1984 and is still going strong. It is the offshoot of four high-quality national attitude surveys, representatives of whom got together and agreed to collaborate in developing a common 15-minute supplement that would be added to each of their annual questionnaires.1 The fact that the four surveys themselves employed similar high-quality methods meant that the cross-national element started off with the advantage of rigour. The ISSP's first joint module was on the 'role of government'; it has now had four iterations around five years apart, interrupted by modules on other subjects. Before long, other countries with similar national surveys asked to join the ISSP fold, and the enterprise has thus expanded over the years.
It now includes 40 countries, each of which attempts to run the agreed module each year. Admittedly, however, its expansion has meant that its standards have become more variable than they were at the beginning. The fact is that each country has to finance its own participation, and some countries simply do not have the resources to achieve the high standards laid down in the ISSP protocols. The result is that the sorts of errors identified by Groves et al. (2004) and Biemer and Lyberg (2003) in respect of national surveys, and those identified by Harkness et al. (2003) in respect of cross-national surveys, are still too much in evidence in the ISSP.
1 Two of those representatives, Peter Mohler and Roger Jowell, were later to be involved in setting up the ESS.
Consistency

When the idea of the ESS began to emerge, it was designed and structured to avoid the weaknesses inherent not only in the ISSP model but also in that of various other multinational time series. Above all, regardless of their initial aims, these series ultimately lacked the ability to impose and maintain consistent standards over time. In contrast, the ESS set out to embrace continuing survey quality as its top priority and, to do this, it had to be sure that the resources needed to achieve that goal would be available.

To appreciate the complexity of the task involved, it is as well to contrast it with the task of mounting a successful single-nation survey. A national survey with similar goals to those of the ESS would have to select its sample by familiar well-tested methods, design and test its questionnaire, and subsequently employ around 200 or more interviewers (and rather fewer coding staff) to contact, say, the 3000 addresses or households that will yield around 2000 respondents. The questionnaire might contain about 400 variables per respondent, so in the end about 800,000 single measurements (400 x 2000) will make up the data file for such a survey.

To design, implement and successfully conduct such a national survey, a number of quite distinct methods and techniques are thus combined into a single survey process. Among these are cognitive methods that enable substantive research questions to be transformed into appropriate survey items, sampling statistics which govern the design and implementation of the sample, logistical and process quality methods which enable efficient and successful data collection, content analysis methods which enhance coding, documentation and data transfer techniques which convert the answers into an analysable dataset, and data analysis techniques (combined with other quality measures) designed first to assess data quality and then to interpret the findings.
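The arithmetic of this contrast, including the multi-national scale-up discussed next, can be checked directly. The figures below are the chapter's own illustrative values, not actual ESS counts.

```python
# Back-of-envelope check of the scale figures quoted in the text.
# All inputs are the chapter's illustrative values, not actual ESS counts.
variables_per_respondent = 400
respondents_per_country = 2000
addresses_per_country = 3000
interviewers_per_country = 200
countries = 20

national_measurements = variables_per_respondent * respondents_per_country
multinational_measurements = national_measurements * countries
total_addresses = addresses_per_country * countries
total_interviewers = interviewers_per_country * countries

print(national_measurements, multinational_measurements)  # 800000 16000000
print(total_addresses, total_interviewers)                # 60000 4000
```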
It is, however, a quantum leap in complexity when one moves from the national to the multi-national arena in survey design and implementation. A cross-national survey of, say, 20 countries will produce 16 million measurements instead of 800,000; it will require 4000 interviewers in all rather than 200; and it will involve contacting some 60,000 addresses or households (rather than 3000) to achieve 40,000 respondents (rather than 2000). Twenty different field organisations, as opposed to one, will be involved in the data collection, and they all have to be closely coordinated. Nor can the 20 resulting data files simply be processed and analysed as if they were additive. On the contrary, as other multi-national surveys have discovered over more than two decades, a great deal of modification and adaptation is needed before a single combined data file can emerge.

Surely, then, a project such as the ESS, which set out to do all this in a consistently rigorous way, was doomed to failure? Well, to some extent, of course, it does fail. As chapters 1 and 6 show, there are certainly deviations
from the specification, and there are errors too. But the general picture is comprehensively better than some predicted and many feared. It is fair to say that the ESS is setting new quality benchmarks which other surveys are already attempting to meet. Most prominent among these benchmarks are:

• a new approach to transparency
• a multi-national, multi-disciplinary central coordinating team
• an innovative system for rigorous probability sample designs in all countries
• a readily available data source on non-response bias and other measurement errors
• a new approach to questionnaire translations
• speedy and free access to data, with no 'privileged' access
• capacity building for comparative empirical research.

Each of these new benchmarks is discussed in the paragraphs below.

Transparency

Every scientific instrument or publication ought to be open to examination and criticism by other scientists, who must in principle be able, wherever possible, to replicate its methodology. This involves making all the relevant steps in the methodology readily and freely available. Thus, as far as multi-national surveys are concerned, standards of documentation must be such as to specify and make available, among other things, precise details of the sample designs in each country, the conceptual framework of the questionnaire, the method and product of its translations, a codebook of its results2 and, of course, the actual datasets themselves. In addition, paradata such as fieldwork contact protocols must be documented and available, and any weighting procedures must be justified and recorded.

It is not enough merely to note all this information and then store it. The principle of transparency requires that any scientist who wishes to have free access to this information should be able conveniently to do so. Until now, these sorts of documents have been unavailable to 'outsiders' either forever or, at least, for a prohibitively long period of time.
In many cases they are simply not collected or stored in the first place. The ESS routinely provides these sorts of data on its main and data websites, both available via www.europeansocialsurvey.org. So, for instance, paradata derived from the ESS's contact forms about the implementation of its sampling procedures at the individual level enable non-response bias to be estimated. Making such data freely available to the scientific community in 'real time' is a major step towards accurate quality assessment and ultimately leads to improvements in survey measurement.

Although all surveys, and especially complex multi-national surveys, are inevitably subject to error, what can and should be avoided as far as possible are undetected errors which nonetheless damage the analytical potential of the survey. Identifying and correcting for errors is far preferable, and helps to prevent the future occurrence of similar errors. This process should be at the heart of high-quality management. The long-term aim of the ESS is not to be unusual or unique in providing free and immediate access to its paradata; that would hardly be in the ESS's interests, since it might suggest the ESS is unusually error-prone, and one of our primary aims is in any case to influence practice more widely. We need to move to a situation in which trust in the data and in its equivalence is not assumed but established (van de Vijver and Leung, 1997). All interested parties must be able to challenge or defend the data quality of a project or time series with full knowledge of its strengths and weaknesses.

2 A codebook for the ESS, compiled by Sally Widdop, is now being printed which covers the first two rounds of the time series. It will be available in early 2007.

Coordination and management

Multi-national surveys such as the ESS can be, and have been, organised in many ways. Some (such as the ISSP) are informal collaborations of national research institutes. Others (such as the World Health Organisation's surveys) are contracted by an international body. Still others (such as the Survey of Health and Retirement in Europe) are voluntary associations of organisations which are then centrally funded. The ESS is unusual insofar as it is centrally organised and directed but has decentralised funding (see chapter 1). Responsibilities are divided between members of the ESS Central Coordinating Team (CCT) and the larger body of National Coordinators (NCs).
While the CCT directs the survey and is funded by the EC, the NCs are responsible for its national implementation and are funded by their own academic funding councils. The CCT is a multi-disciplinary team of substantive and methodological specialists, and includes statisticians, political scientists, sociologists, psychologists, linguists, archivists, quality process managers, survey design specialists and cross-cultural specialists. It is also truly multi-national. Fieldwork is undertaken not by any of the multi-national survey agencies, but by an eclectic mixture of field agencies that includes commercial houses and national statistical institutes. In short, ESS survey management is a combination of federal control with independent national implementation. And it seems to suit the needs of a large, highly dispersed cross-national survey that places consistent rigour at the top of its agenda.
Innovative probability samples

Scientific research requires probability sampling methods. In a survey such as the ESS, this means that in each nation every resident aged 15 and over should have a known (not necessarily equal), non-zero probability of selection. Unequal chances of selection then need to be corrected by subsequent weighting. This is not easy to achieve in a cross-national survey where each country has different existing sampling frames within a wide range of geographical, institutional, cultural and legal settings. Moreover, since good probability sampling is facilitated by accurate official statistics with detailed regional information and/or well-maintained population registers, there is even more variation. For these reasons, perhaps, many cross-national surveys have baulked at the challenge of achieving strictly equivalent random samples in each country.

The ESS introduced two concepts into its sampling design (see chapter 2) in order to ensure that each achieved national sample (not just each starting – or gross – sample) was of the same effective size, after allowing for non-response and design effects. The first was the setting up of a specialist Sampling Panel, which works closely with individual NCs to ensure that the properties of each sample are equivalent. The second is the requirement that each sample should be designed to result in the same statistically 'effective sample size', thus correcting in advance for the various effects of multi-stage sampling. So, for the majority of ESS countries that use multi-stage samples, this leads to a starting sample that must be (calculably) larger than would otherwise be the case, even after allowing for non-response. The increase in the number of starting interviews depends on the nature of each country's proposed multi-stage sampling design. All these calculations are carried out in conjunction with a member of the Sampling Panel, and the whole panel must eventually sign off each sample design.
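A minimal sketch of this kind of calculation, using the standard clustering design-effect formula deff = 1 + (b - 1) * rho, is given below. The input values are hypothetical round numbers chosen for illustration; the Sampling Panel's actual sign-off calculations are considerably more elaborate.

```python
# Illustrative calculation (not the ESS Sampling Panel's actual tool):
# gross sample size needed for a target *effective* sample size under a
# clustered multi-stage design, with a simple response-rate adjustment.
import math

def required_gross_sample(n_eff, cluster_size, rho, response_rate):
    """Return (design effect, net interviews, gross addresses) for a design."""
    deff = 1 + (cluster_size - 1) * rho        # clustering design effect
    n_net = math.ceil(n_eff * deff)            # interviews needed
    n_gross = math.ceil(n_net / response_rate) # addresses to issue
    return deff, n_net, n_gross

# Target of 1500 effective cases, 9 interviews per sample point,
# intra-cluster correlation 0.0625, 75% response rate (all hypothetical)
deff, n_net, n_gross = required_gross_sample(1500, 9, 0.0625, 0.75)
print(deff, n_net, n_gross)  # 1.5 2250 3000
```

The sketch makes the chapter's point concrete: a clustered design with the same target precision needs half as many interviews again (2250 rather than 1500), and correspondingly more issued addresses.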
All this adds up to a degree of sampling equivalence that we believe is unprecedented in cross-national surveys. Moreover, the actual outcomes of each round of the ESS are then used by the Sampling Panel to adjust their calculations for subsequent rounds, leading to a built-in process of round-by-round improvements in precision. As always, the whole process is documented and available to other researchers.

A source of data on error and bias

As Noelle-Neumann and Peterson (2000) put it, the process of sampling allows us to make inferences about everyone without asking everyone. But this
Jowell-Chapter-08.qxd
3/9/2007
5:03 PM
Page 163
What is being learned from the ESS?
163
assumes not only an accurate starting sample, but also something close to the unattainable ideal of a 100 per cent response rate. In reality, not all selected persons can be contacted, some might not be willing to take part, and some might drop out for other reasons. So the question arises as to whether those not taking part are systematically different from those taking part. If so, the resulting sample is unlikely to be an exact microcosm of the population after all.

Research on bias due to non-response has identified a number of systematic differences between respondents and non-respondents (Groves and Couper, 1998; Koch, 2002; Groves, 2005; Stoop, 2005). The consensus is that younger (mostly male) working people who live in single-person households are more likely to be missing from surveys, as are elderly (mostly female) retired people who live in single-person households. People with less education are also less likely to take part than others, as are city dwellers. The result is that survey measurements related to these characteristics, or to combinations of them, are more likely to be biased. As noted, however, they are not necessarily biased, and there are a number of measures that can identify the extent of bias post hoc. Even so, it is always advisable to reduce these sources of possible bias in advance by adapting contact protocols accordingly. Obtaining high response rates overall might not achieve this, because there turns out to be no clear relationship between overall response rates and non-response bias: it all depends on the characteristics of the non-respondents in relation to the survey variables. It follows, therefore, that we need to investigate possible non-response bias irrespective of the response rate. But to do so requires easy access to paradata, such as population characteristics, as well as basic data about the sample composition. Some of these data emerge from certain sampling frames (such as population registers).
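A minimal illustration of such an investigation is to compare the composition of the achieved sample with known population benchmarks (from, say, a register or census) and, where discrepancies appear, to derive crude post-stratification weights. All categories and shares below are invented for the example:

```python
# Comparing achieved-sample composition with population benchmarks to flag
# possible non-response bias. All figures here are invented illustrations.

population = {"age 15-29": 0.24, "age 30-59": 0.52, "age 60+": 0.24}
achieved   = {"age 15-29": 0.18, "age 30-59": 0.53, "age 60+": 0.29}

def composition_bias(achieved, population):
    """Sample share minus population share, per category."""
    return {k: round(achieved[k] - population[k], 4) for k in population}

def poststratification_weights(achieved, population):
    """Weights restoring population shares -- a crude post-hoc correction
    that helps only for variables related to the weighting categories."""
    return {k: round(population[k] / achieved[k], 4) for k in population}

print(composition_bias(achieved, population))
# {'age 15-29': -0.06, 'age 30-59': 0.01, 'age 60+': 0.05}
```

The invented figures mirror the pattern described above: younger people under-represented, elderly people over-represented. As the text stresses, such weighting corrects bias only on (and via) the characteristics used to weight.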
In any event, the ESS collects as much of this data as possible (within the confines of privacy concerns) and makes it available on its website (see chapter 7).

But sampling errors and bias are only one form of hazard for surveys. Other errors (known as measurement errors) occur during the very process of data collection (see, for instance, Saris, 1998). Although there are standard statistical tools to check for measurement error and bias, comparative research brings with it the problem of cultural differences, which lies beyond the reach of these standard tools. Van de Vijver and Leung (1997) discuss a wide range of additional statistical tools that can remedy some of the bias and errors found in comparative data. But there are limits to such techniques too, since they cannot deal with single-item indicators: dealing statistically with latent constructs requires a minimum of two indicators. Yet respondents might become annoyed if asked twice about the same thing, and it is sometimes difficult to find two different convincing indicators (see Saris, 1998, p.82). Moreover, insisting on two indicators for
every topic in a questionnaire would be impracticable. To deal with this dilemma, the ESS has adopted methods and techniques developed during the 1990s by Saris, Andrews and others, which allow measurement errors for single indicators to be identified and corrected across surveys and countries (see chapter 3; Saris and Andrews, 1991). So the ESS openly provides data that enable both non-response and measurement error to be addressed in a multi-national setting (see the website of the household non-response survey workshop, www.nonresponse.org).

Translation

Survey translation has long suffered from the myth that anyone able to speak and read two or more languages is thereby equipped to translate between them. To aid the process, a mechanical method of assessing the quality of survey translations was devised, known as ‘back translation’. The problem with back translation is not just that it is circular and often misleading (Mohler, 2005), but also that it takes no account of the theories, methods and findings of linguistics (Harkness et al., 2003). A more theory-grounded approach therefore needed to be developed.

This approach acknowledges that translating survey questions, response scales and instructions for interviewers requires skills from several distinct fields – including substantive social science, statistical methods, questionnaire design, translation and cognitive psychology. It therefore calls for a team approach in which translators work in close co-operation with survey specialists in each country. It is this approach that the ESS has adopted, full details of which can be found in chapter 4. In addition, as a form of quality control initiated after Round 1 of the ESS, selected sections of the questionnaire in different languages are assessed and evaluated.
Though time-consuming, this quality assurance procedure allows us not only to gain a deeper insight into how the translation process works in practice, but also to identify procedures that could be improved. Once again, the ESS’s goal is continuous quality improvement. And, like all other features of the project, all the translation protocols are made freely available on the ESS website.

Free and easy access to data
As noted, the ESS has followed the model of general social surveys in making its fully documented data freely, speedily and easily available to the social science community (in the ESS’s case within nine months of the scheduled end of fieldwork). Past comparative surveys have almost always been beset
by longer delays, either for organisational reasons or – as commonly – because the principal investigators impose delays while they prepare their own substantive publications. The unprecedented speed of the ESS is facilitated not only by the efficiency of the ESS Data Archive (see chapter 7) but also by the ESS’s overall management and funding structure, which allocates a special budget to its archival tasks and divides its preparatory work in a complementary way. The speed of access to ESS data is particularly important because – as the number of registered users and the queue of publications demonstrate – it allows analyses to begin within months rather than years, while the data are still current.

A German saying on craftsmanship holds that “it takes twice as long to prepare and clean up the workshop as the actual work takes”. This is true of survey data analysis as well. Researchers working on a hitherto unfamiliar dataset require ‘clean’, meticulously documented data if they are to be able to concentrate first on the alternatives for their analysis and then on the analysis itself. This is a far cry from the usual experience. The aim is to prevent users from having to clean their workshops first.

The overwhelming response of the wider research community to the ESS dataset suggests we are on the right track. As noted, the evidence for this is the number of substantive books, articles, papers and presentations that have been published very soon after each data release by scholars who had no part in the ESS design or implementation. The fact that over 10,000 users (many from outside Europe) have already registered on the ESS data website suggests that further work is already in the pipeline, more than fulfilling the ESS’s ambitious aims. Moreover, the recent Infrastructure grant awarded to the ESS by the European Commission (EC) contains provision for upgrading its websites.
In addition to its impact on comparative academic research, the ESS is also likely to have an impact on European governance by providing a reliable and timely source of information about social attitude change in Europe. In time we hope that the ESS may also contribute material that enables social indicators (including attitudinal indicators) of societal change to complement the predominantly economic indicators that have been available hitherto.
Capacity building

In the tradition of general social surveys, which have always been used as a rich source of material for teaching and training, the ESS has from the outset seen itself as a source of capacity building more generally. Already both its methodology and its substantive data are used in course material in many European universities and beyond. The free and straightforward access to ESS methods
and findings helps considerably in removing barriers, whether for teachers or students. But the ESS Edunet training package (http://essedunet.nsd.uib.no), which became available in the autumn of 2005, now increases this potential dramatically by providing in-depth teaching material prepared jointly by the original authors of the questions concerned and specialists in data analysis. The material includes carefully crafted exercises for students. Two on-line teaching modules on data analysis techniques – one using the citizenship engagement module of the ESS (a Round 1 rotating module), the other the Human Values Scale (part of the ESS core) – are already up and running, and others are in preparation. In addition, the new Infrastructure grant from the EC supports a series of formal face-to-face methods training courses, which are being set up by ZUMA and the University of Ljubljana and will take place in Ljubljana.

Apart from formal teaching as a form of capacity building, however, the ESS increasingly seems to act as an informal training ground for survey researchers and field institutes all over Europe, simply through the size and scope of its round-by-round activities. Researchers who had never dealt with the notion of ‘effective’ sample sizes or the ESS’s innovative approach to translation are fast coming to grips with them. Field institutes that were horrified at first by the ESS’s complex contact sheets or detailed documentation standards are now beginning to regard them with more equanimity. And so on. There is no doubt that the brash new kid on the block is being accepted into the fold. It may be some time before the ESS methods become the norm, but they are already more widely understood and appreciated than they were, and they will, in time, become more so.
Conclusion

As noted, the influence of the ESS on quantitative social science measurement is already taking hold, and not only in Europe. Conference papers about or based on the ESS began appearing almost from the start, and journal articles and books giving more extended treatment are now becoming commonplace. The ESS has in some respects been an experiment. Although it employs well-established high-quality survey methods and procedures, its uncompromisingly high standards are perhaps unique for comparative attitudinal surveys, whether in Europe or elsewhere. But in a surprisingly short time the ESS began to be described – by funders as well as practitioners, by civil servants as well as academics – as the new gold standard by which comparative social surveys ought to be judged. Although the ESS still contains errors and there are still inevitable deviations from its grand design, the project has
acquired an unusual authority. The test will be whether it can maintain and improve its standards over time.

For the ESS team, one of the most noticeable impacts has been on the attitudes of its many funding organisations throughout Europe. True, the initiative for the ESS came from the academic funding world via the European Science Foundation. But the possibility was always there that the relative cost of the ESS would make funders think again. As far as we are aware, this problem has barely surfaced. Moreover, the EC has likewise continued its support over the years (indeed increased it substantially via the Infrastructure grant). There seems to be a growing realisation that important decisions need to be informed by high-quality data and that in the past some funders may have allowed themselves to be satisfied with lower than optimal standards. Naturally there will always be tight cost constraints. But perhaps the ESS is helping to redefine the notion of what is cost-effective and, as importantly, what is not – in particular, that high-quality measurement does not come at bargain-basement prices, and that the maxim that “some data are better than no data” may well be counterproductive, as the natural and physical sciences discovered long ago.

The fact is that the ESS seems to be setting new standards of transparency, quality and precision. It has made important methodological innovations. As a result, its data are already being quarried and analysed by unexpectedly large numbers of scholars, and its methods are having increasing influence on survey practice. But the ESS team needs to bear in mind that these achievements may be short-lived unless it pays as much attention to these goals in its adolescence as it has done during its formative years.
References

Biemer, P. and Lyberg, L.E. (2003), Introduction to Survey Quality, Hoboken, NJ: Wiley and Sons.
Groves, R.M. (2005), Research Synthesis: Non-response Rates and Non-response Error in Household Surveys, Ann Arbor: preprint (under review).
Groves, R.M. and Couper, M.P. (1998), Non-response in Household Interview Surveys, New York: Wiley and Sons.
Groves, R.M., Fowler, F.J., Couper, M.P., Lepkowski, J.M., Singer, E. and Tourangeau, R. (2004), Survey Methodology, Hoboken, NJ: Wiley and Sons.
Harkness, J.A., van de Vijver, F.J.R. and Mohler, P. (eds) (2003), Cross-cultural Survey Methods, Hoboken, NJ: Wiley and Sons.
Jowell, R. (1998), ‘How Comparative is Comparative Research?’, American Behavioral Scientist, 42 (2), pp.168–177.
Koch, A. (2002), ‘20 Jahre Feldarbeit im ALLBUS: Ein Blick in die Blackbox’, ZUMA-Nachrichten, 51 (26), pp.9–37.
Mohler, P. (2005), Recognising the obvious or getting your scales right in cross-cultural surveys, Mannheim: ZUMA.
Noelle-Neumann, E. and Petersen, T. (2000), Alle nicht jeder – Einführung in die Methoden der Demoskopie, Heidelberg: Springer.
Saris, W.E. (1998), ‘The Effects of Measurement Error in Survey Research’, in: J.A. Harkness (ed.), Cross-Cultural Survey Equivalence – ZUMA-Nachrichten Spezial, 3, Mannheim: ZUMA, pp.67–86.
Saris, W.E. and Andrews, F.M. (1991), ‘Evaluation of Measurement Instruments using a Structural Equation Modelling Approach’, in: P.P. Biemer, R.M. Groves, L.E. Lyberg, N.A. Mathiowetz and S. Sudman (eds), Measurement Errors in Surveys, New York: Wiley.
Stoop, I. (2005), The Hunt for the Last Respondent, Den Haag: SCP.
Van de Vijver, F.J.R. and Leung, K. (1997), Methods and Data Analysis for Cross-Cultural Research, Newbury Park, CA: Sage.
Jowell-Chapter-09.qxd
9
3/9/2007
6:44 PM
Page 169
Value orientations: measurement, antecedents and consequences across nations

Shalom H. Schwartz*
Introduction

Values are central to public discourse today. Competing groups demand priority for the values they hold dear and dismiss conflicting values as unworthy. Why do values draw such interest? Theorists have long considered values central to understanding social behaviour (e.g. Allport et al., 1960; Kluckhohn, 1951; Rokeach, 1973; Williams, 1968), because they view values as deeply rooted, abstract motivations that guide, justify and explain attitudes, norms, opinions and actions (Feather, 1985; Halman and de Moor, 1994; Rokeach, 1973; Schwartz, 1992). Individuals have different value priorities, and the prevailing value emphases in societies differ too. Hence, values have predictive and explanatory potential at both the individual and the societal levels. Moreover, values can reflect major social change within societies and across nations, and they may influence the direction and speed of social change.

Survey researchers, like other social scientists, view values as basic, abstract motivations. In practice, however, they distinguish little between values and attitudes (Halman and de Moor, 1994, p.22). They usually measure values with sets of attitude questions in specific domains of life such as religion, morality, politics and work, and they sometimes infer or derive broad underlying

* Shalom H. Schwartz is the Sznajderman Professor Emeritus of Psychology at the Hebrew University of Jerusalem, Israel.
values statistically from arrays of attitudes, such as tradition vs. secular-rational values (Inglehart, 1997). The confounding of values and attitudes reflects in part the absence of a comprehensive theory of the basic motivations that are socially expressed as values. It also reflects the lack of a theory-based instrument to measure basic values. Consequently, most empirical studies of values provide less integrated and more piecemeal understandings of socially meaningful issues than is desirable. What is required ideally is a value theory and instruments that represent the broad and basic motivations relevant to a wide variety of attitudes and behaviour across the different domains of life.

This chapter presents a theory of ten basic human values intended to encompass the major distinctive value orientations recognised across cultures. This theory guided the development of the ‘human values scale’ of the ESS. In this chapter I describe the scale and its scoring and assess how successfully it measures the ten values in the ESS countries. Then I discuss relations of people’s value priorities to background variables that might influence them (e.g. age, gender, education and income). I go on to compare the average value priorities of individuals across countries and interpret national differences and similarities. Finally, I illustrate the use of individual value priorities to explain individual differences in attitudes and behaviour, and of cultural value orientations to explain national differences. I examine links of values to attitudes toward accepting immigrants, to components of social capital, and to political activism. But first I discuss the nature of values and the implications for measurement.

The nature of values

A consensus has emerged gradually over the last 50 years or so about how to conceptualise basic values (Braithwaite and Scott, 1991). The conception implicit in the writings of many theorists[1] includes six main features (Schwartz, 2005a):

1. Values are beliefs that are linked inextricably to affect. When values are activated, they become infused with feeling. People for whom independence is an important value become aroused if their independence is threatened, despair when they are helpless to protect it, and are happy when they can enjoy it.

2. Values refer to desirable goals that motivate action. People for whom social order, justice, and helpfulness are important values are motivated to promote these goals.

[1] e.g. Kluckhohn, 1951; Morris, 1956; Allport, 1961; Scott, 1965; Kohn, 1969; Rokeach, 1973; Hofstede, 1980; Feather, 1985; Schwartz and Bilsky, 1987; Inglehart, 1997.
3. Values transcend specific actions and situations. Obedience and honesty, for example, are values that may be relevant at work or in school, in sports, business and politics, with family, friends or strangers. This feature distinguishes values from narrower concepts like norms and attitudes that usually refer to specific actions, objects or situations.

4. Values serve as standards or criteria. Values guide the selection or evaluation of actions, policies, people and events. People decide what is good or bad, justified or illegitimate, worth doing or avoiding, by considering the effects on attaining their cherished values.

5. Values are ordered by importance relative to one another. The ordered set of values forms a system of value priorities. Societies and individuals can be characterised by their systems of value priorities. Do people attribute more importance to achievement or justice, to novelty or to tradition? This hierarchical feature also distinguishes values from norms and attitudes.

6. The relative importance of multiple values guides action. Any attitude or behaviour typically has implications for more than one value. For example, attending church might express and promote tradition, conformity, and security values for a person at the expense of hedonism and stimulation values. The trade-off among relevant, competing values is what guides attitudes and behaviours (Tetlock, 1986; Schwartz, 1992, 1996). Values contribute to action to the extent that they are relevant in the context (hence likely to be activated) and important to the actor.

Basic value orientations serve not only as independent variables. They also reflect the influences to which individuals and groups are exposed. Theorists trace differences in value priorities to two major sources. First are needs or inborn temperaments (Rokeach, 1973). This source sets limits on the value priorities that a group or society can socialise or transmit successfully.
Basic needs constrain socialisers to accept non-negligible emphases on hedonism and stimulation values, for example, even if these values sometimes disrupt smooth social functioning (Schwartz and Bardi, 2001). People evolve value priorities that cope simultaneously with their basic needs and with the opportunities and barriers in their environment – including the ideas of what is legitimate or forbidden.

The other major source refers directly to social experience. The experiences people share because of their common locations in the social structure (such as their age, gender and occupation) influence their value priorities (Kohn, 1969; Inglehart, 1997; Schwartz and Bardi, 1997; Schwartz, 2005b). Unique experiences (such as trauma or migration) also affect individuals’ value priorities (Feather, 1985). Comparing the value priorities of groups and individuals can therefore reveal the impacts of distinctive experiences (e.g. illness, retirement) and of major social changes (e.g. economic depressions, political upheavals).
Current survey practice and the conception of values

Implicitly, most survey researchers hold conceptions of values close to the one outlined above. However, the many value items that appear in surveys are consistent only with affect, goals and standards (features 1, 2 and 4 above). They do not cover values that transcend actions and situations (feature 3), because the items often refer to specific situations or domains. Such items do not measure ‘basic’ values in the sense of values that are relevant across virtually all situations, and this situational specificity contaminates measured individual differences in value priorities. Consider the survey item ‘giving people more say in important government decisions’, used in the materialism/post-materialism scale (Inglehart, 1997). Individual respondents’ support for or opposition to the current government influences the importance they attribute to this goal, as found in data we gathered in Israel in 1999. The meaning of such items depends on the interaction between a person’s ‘basic’ values and the context and domain in which the items are measured.

Researchers often combine responses to items from a number of specific domains in order to infer underlying, trans-situational values (e.g. materialism). But, because situation-specific items are sensitive to prevailing socio-political conditions, the choice of items may still substantially influence both group- and individual-level priorities. For example, Clarke et al. (1999) demonstrate the effect of removing Inglehart’s materialism item ‘creating more jobs’ from his four-item battery and replacing it with ‘fighting rising prices’. When unemployment is high, the item referring to jobs yields more materialists; when inflation is high, the item referring to prices yields more materialists.

Contrary to features 5 and 6 – both of which refer to the importance of particular values – many survey items do not measure values in terms of importance.
Instead, they present attitude or opinion statements and employ agree–disagree, approve–disapprove, or other evaluative response scales. The researcher may then try to infer indirectly the importance of the values presumed to underlie these attitudes or opinions. But multiple values may underlie any given attitude or opinion. Hence, it is hazardous to infer basic value priorities from responses to specific attitude and opinion items. In order to discover basic values with this approach, one must ask numerous questions across many domains of content. One then searches for underlying consistencies of response which may or may not be present. Such an approach requires many items and may not discern clear sets of basic value priorities. Inglehart adopted this approach in deriving his two updated dimensions of culture. He describes the tradition/secular-rational dimension, for example, as centrally concerned with orientations toward authority (Inglehart and Baker, 2000). He bases this on five items that load together in a factor analysis (importance of God, importance of obedience and religious faith for children, justifiability of abortion, sense of national pride, and attitude toward respect for authority). The secular-rational pole of this orientation is not
measured directly. It is inferred from responses that reject these five values and attitudes. The two items that load most strongly on this factor both concern religion. The broader meaning of this dimension is inferred from the correlations of the five-item index with various beliefs and attitudes.

Inglehart’s simplification of the value domain into two dimensions may have proved useful in assessing differences among cultures, but not for studying individual differences. The meaning of his dimensions is necessarily loose because they are derived by inference from correlations among diverse items. These dimensions were not clearly defined or operationalised a priori. Moreover, two dimensions can hardly capture the richness of individual and cultural differences in values. For that purpose, a more finely tuned set of basic values is needed.

A further limitation of most value instruments (though not of the World Values Survey) is their focus on one or only a few value domains (religion, or politics, or co-operation/competition, or conformity/self-direction). Even approaches that aspire to comprehensiveness (e.g. Rokeach, 1973) have overlooked basic values (tradition, power) that significantly influence behaviour. The ESS human values scale derives from a theory intended to identify the major distinctive values recognised across cultures.

A theory of the content and structure of basic human values
Ten basic types of value

In this theory, values are defined as desirable, trans-situational goals, varying in importance, that serve as guiding principles in people’s lives. The crucial content aspect that distinguishes values is the type of motivational goal they express. People must coordinate with others in the pursuit of the goals that are important to them. Hence, groups and individuals represent these requirements cognitively (linguistically) as specific values about which they communicate. I derived ten motivationally distinct, broad and basic values from three universal requirements of the human condition: the needs of individuals as biological organisms; the requisites of coordinated social interaction; and the survival and welfare needs of groups (Schwartz, 1992, 2005a). For instance, conformity values derive from the prerequisites of interaction and of group survival. For interaction to proceed smoothly and for groups to thrive, individuals must restrain impulses and inhibit actions that might hurt others. Self-direction values derive from needs for mastery and from the interaction requirements of autonomy and independence.

The ten basic values are intended to include all the core values recognised in cultures around the world, covering the distinct content categories found in earlier value theories, in value questionnaires from different cultures, and in religious and philosophical discussions of values. Virtually all the items found
in lists of specific values from different cultures[2] express one of these ten motivationally distinct basic values. The core motivational goal of each basic value, presented in Table 9.1, defines that value.

Most research with the value theory utilised a 56- or 57-item list of abstract value items, such as creativity, wealth and honesty (Schwartz, 1992, 2005a). Respondents rated each one for importance as a guiding principle in their life. Multi-dimensional analyses of the relations among the single items – based on 210 samples from 67 countries – supported the discrimination of the postulated ten basic values. Confirmatory factor analyses of data from 23 countries yielded similar results (Schwartz and Boehnke, 2004). Comparing the results from each society established that 46 of the items had nearly equivalent meanings across cultures. The ESS 21-item values scale draws upon these items.

Table 9.1 Definitions of motivational types of values in terms of their core goal

POWER: Social status and prestige, control or dominance over people and resources
ACHIEVEMENT: Personal success through demonstrating competence according to social standards
HEDONISM: Pleasure and sensuous gratification for oneself
STIMULATION: Excitement, novelty, and challenge in life
SELF-DIRECTION: Independent thought and action – choosing, creating, exploring
UNIVERSALISM: Understanding, appreciation, tolerance and protection for the welfare of all people and for nature
BENEVOLENCE: Preservation and enhancement of the welfare of people with whom one is in frequent personal contact
TRADITION: Respect for, commitment to and acceptance of the customs and ideas that traditional culture or religion provide for the self
CONFORMITY: Restraint of actions, inclinations and impulses likely to upset or harm others and violate social expectations or norms
SECURITY: Safety, harmony and stability of society, of relationships, and of self
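The scoring convention used with scales of this kind can be sketched briefly: each basic value is scored as the mean of its items, centred on the respondent's mean rating across all items to correct for individual differences in scale use. The item-to-value key below is a truncated, illustrative mapping using ESS-style variable names, not the official ESS scoring key:

```python
# Sketch of person-centred scoring for a Schwartz-style values scale.
# ITEM_KEY is an illustrative, truncated mapping, not the official ESS key.

ITEM_KEY = {
    "power":       ["imprich", "iprspot"],
    "achievement": ["ipshabt", "ipsuces"],
    "benevolence": ["iphlppl", "iplylfr"],
}

def value_scores(ratings):
    """ratings: dict mapping item name -> importance rating (e.g. 1-6)."""
    mrat = sum(ratings.values()) / len(ratings)   # individual mean rating
    scores = {}
    for value, items in ITEM_KEY.items():
        raw = sum(ratings[i] for i in items) / len(items)
        scores[value] = raw - mrat                # centre on the person mean
    return scores

respondent = {"imprich": 2, "iprspot": 2, "ipshabt": 4,
              "ipsuces": 4, "iphlppl": 6, "iplylfr": 6}
print(value_scores(respondent))
# {'power': -2.0, 'achievement': 0.0, 'benevolence': 2.0}
```

The centring step matters for the comparisons discussed in this chapter: it converts absolute ratings into relative priorities, which is what the theory treats as meaningful.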
The structure of value relations

In addition to identifying ten basic values, the theory explicates the structure of dynamic relations among them. The source of the value structure is the fact that actions in pursuit of any value have consequences that conflict with some values and are congruent with others. For example, pursuing
[2] Scott, 1965; Dempsey and Dukes, 1966; Gordon, 1967; Bales and Couch, 1969; Lorr et al., 1973; Rokeach, 1973; Braithwaite and Law, 1985; Fitzsimmons et al., 1985; Chinese Culture Connection, 1987.
achievement values may conflict with pursuing benevolence values. Seeking success for self is likely to obstruct actions aimed at enhancing the welfare of others who need one’s help. But pursuing both achievement and power values may be compatible. Seeking personal success for oneself tends to strengthen and be strengthened by actions aimed at enhancing one’s own social position and authority over others. Similarly, pursuing novelty and change (stimulation values) is likely to undermine preserving time-honoured customs (tradition values). But pursuing tradition values is congruent with pursuing conformity values. Both motivate actions of submission to external expectations. The circular structure in Figure 9.1 portrays the postulated total pattern of relations of conflict and congruity among values.

Figure 9.1 Theoretical model of relations among ten motivational types of value [figure not reproduced here]
The circular arrangement of the values represents a motivational continuum. The closer any two values are in either direction around the circle, the more similar are their underlying motivations. The more distant any two values are, the more antagonistic are their underlying motivations. The idea that values form a motivational continuum has another critical implication, namely that the division of the domain of value items into ten distinct values is an arbitrary convenience. It is reasonable to partition the domain of value items into more or less finely tuned distinct values according to the needs and objectives of one’s analysis.
The conflicts and congruities among all ten basic values yield an integrated structure of values. Two orthogonal dimensions may summarise this structure. Self-enhancement vs. self-transcendence: on this dimension, power and achievement values oppose universalism and benevolence values. Both of the former emphasise self-interest, whereas both of the latter involve concern for the welfare and interests of others. Similarly, on the openness to change vs. conservation dimension, self-direction and stimulation values oppose security, conformity and tradition values. Both of the former emphasise independent action, thought and readiness for new experience, whereas all of the latter emphasise self-restriction, order and resistance to change. Hedonism shares elements of both openness and self-enhancement.

This basic structure was found in samples from 67 nations (Fontaine and Schwartz, 1996; Schwartz, 2005a). It points to the broad underlying motivations that may constitute a universal principle that organises value systems. People differ in the importance they attribute to the ten values. However, the same structure of motivational oppositions and compatibilities organises their values.

The idea that values form a motivational circle is unique to the theory. It provides an important tool for relating values to behaviour, attitudes, opinions, social experience and personality. This integrated motivational structure of value relations makes it possible to study how whole systems of values, rather than single values, relate to other variables. Suppose a particular value is relevant to a particular behaviour. Then, both the values adjacent to that value in the value structure and those opposed to it are likely to be relevant to the behaviour. For instance, stimulation values relate positively to readiness to adopt innovative social practices (e.g. early use of the Internet).
The values adjacent to stimulation in the value circle – hedonism and self-direction – also relate positively. In contrast, the opposing values in the structure – conformity, tradition and security – relate negatively to adopting innovations. This exemplifies feature 6 of the conception of values outlined above, which refers to prioritisation of values. The trade-off among the set of relevant competing values guides people's adoption of innovations.
Comprehensiveness of the ten basic values

Do the ten values cover the full range of people's motivational goals? There is consistent, if not decisive, evidence to support a claim of comprehensiveness. Local researchers in 18 countries added items with significant content that they thought might be missing from the scale. But analyses that included these items pointed to no new basic values. Rather, each added item correlated as expected with the marker items of the basic value whose
motivational goal it shared. The multi-dimensional analyses in each country also supported the comprehensiveness of the ten basic values. If significant values with unique motivational content were missing, empty regions would appear in the two-dimensional value space. These empty regions would identify gaps in measurement of the motivational continuum. However, no extensive empty regions emerged. So the ten basic values in the theory probably do not exclude any significant, basic value orientations. The relatively comprehensive coverage of basic values recognised across cultures is an important advantage of the theory underlying the ESS value scale.
But are self-reports valid indicators of values?

The ESS, like earlier research, uses people's self-reports to measure the ten values. Might self-reports largely reflect lip-service to values rather than true endorsement? If so, value scores would not relate to behaviours. This chapter presents evidence that the ESS value indexes do indeed relate meaningfully to some behaviours. Extensive evidence of the validity of self-reported values comes from findings using instruments on which the ESS scale drew. As hypothesised, values related to alcohol consumption, delinquent behaviour, risky sexual behaviour, consumer purchases, internet use, mobile phone use, co-operation and competition, inter-group social contact, environmental behaviour, occupational choice, choice of university major, choice of medical specialty, religious observance and voting (sources available in Schwartz, 2005b). This evidence for systematic relations of value priorities to behaviour comes from research in 20 countries.3

Measuring values in the ESS
Development of the Human Values Scale

The ESS method for measuring values is a modification of the Portrait Values Questionnaire (Schwartz et al., 1999, 2001; Schwartz, 2005b). The ESS scale includes brief verbal portraits of 21 different people, gender-matched to the respondent. Each portrait describes a person's goals, aspirations or wishes that point implicitly to the importance of a single value. For example, the following item describes a person for whom self-direction values are important:

3 Although ESS Round 1 was carried out in 22 countries, the Human Values Scale was not asked in Luxembourg or Italy.
Thinking up new ideas and being creative is important to her. She likes to do things in her own original way. The following item describes a person who cherishes power values: It is important to him to be rich. He wants to have a lot of money and expensive things. By describing each person in terms of what is important to him or her – the goals and wishes he or she pursues – the verbal portraits capture the person’s values. This method does not explicitly identify values as the topic of investigation. For each portrait, respondents have to answer the question: ‘How much like you is this person?’ choosing one of six labelled boxes ranging from ‘very much like me’ to ‘not like me at all’. Respondents’ own values are inferred from their self-reported similarity to people who are described in terms of particular values. The similarity judgements are transformed into a six-point numerical scale, and the score for the importance of each value is the mean response to the items that measure it. Two portraits operationalise each value, with three for universalism because its content is especially broad. The format of the scale, its instructions, and two exemplary items appear below. Appendix 1 at the end of this chapter lists all of the items and the value they measure. Here we briefly describe some people. Please read each description and think about how much each person is or is not like you. Tick the box to the right that shows how much the person in the description is like you. How much is this person like you?
Very much like me (1)   Like me (2)   Somewhat like me (3)   A little like me (4)   Not like me (5)   Not like me at all (6)

G1 Thinking up new ideas and being creative is important to him. He likes to do things in his own original way.
G2 It is important to him to be rich. He wants to have a lot of money and expensive things.
The valued goals, aspirations and wishes included in these portraits were selected in three ways:

• Building portraits from the conceptual definitions of the basic value (see Table 9.1). For example, the definition of achievement values yielded: "It is important to him to show his abilities. He wants people to admire what he does."
• Paraphrasing items from the earlier abstract value survey. Thus, "protecting the environment" became "He strongly believes that people should care for nature."
• Making abstract terms or phrases from the earlier survey more concrete. For example, "pleasure" became "It is important to him to do things that give him pleasure."
Methodological issues in designing the scale

Why ask respondents to compare the portrait to themselves rather than themselves to the portrait? Asking them to compare other to self directs attention only to the aspects of the other that are portrayed. Thus, the similarity judgement is likely to focus on these value-relevant aspects. In contrast, comparing self to other would instead focus attention on self and might cause respondents to think about the large number of self-characteristics accessible to them (Tversky, 1977; Holyoak and Gordon, 1983; Srull and Gaelick, 1983). Not finding these characteristics in the portrait, respondents might then overlook the similarity of values. Why not directly present the valued goal and obtain a personal importance rating? (An example would be 'How important is it for you to be tolerant, secure, etc.?') The ESS method has several advantages over this approach. First, few people spend time in everyday life thinking about what is and is not important to themselves. But people do constantly assess others and compare them to self. Second, many people find it difficult to decide what is really important to them, and may be puzzled or disturbed by what they conclude. This elicits self-presentation biases. Third, direct items require a response scale of importance. Such scales must include many points and be stretched at the top to discriminate adequately, because people typically rate most values as important. Fourth, elderly and less educated respondents commonly find it difficult to translate their value priorities into points on importance scales. Finally, direct items mention only a single, abstract, valued goal. Experience with earlier value scales revealed that it is difficult for people who are unaccustomed to thinking about themselves in abstract terms to respond to such items.
Each ESS value item consists of two sentences intended to express a motivation for the same goal. What if people react differently to the two sentences? One sentence most often mentions the importance of a valued goal to the person. The other mentions the person’s feelings toward the goal expressed as an aspiration, desire or intention to reach it. Pre-tests in the Netherlands and Great Britain examined whether the two parts of the items measure the same concept. The questionnaire included the three universalism items, complete or split into their component sentences. Each version appeared twice. A series of analyses led to the conclusion that, for all practical purposes, the importance sentence and the feeling sentence measure the same thing. Moreover, combining the two into one item neither increased nor harmed reliability and validity. We decided to include two sentences in each item based on another consideration. Many respondents to versions of the scale that included single sentence portraits complained that such short portraits were not rich enough to describe a real person to whom they could easily relate.
Correcting for response tendencies

Respondents differ in their use of the similarity response scale. Some rate most portraits very similar to themselves, others use the middle of the response scale, and still others rate most portraits dissimilar to themselves. The scale should measure people's value priorities, the relative importance of the different values, because it is the trade-off among relevant values, not the absolute importance of any one value, that influences behaviour and attitudes. Suppose two people rate tradition at the same scale point (3), but one rates all other values higher on the scale and the other rates all other values lower. Despite the same absolute score, tradition values obviously have higher priority for the first person than for the second. So to measure value priorities accurately, we must correct for individual differences in the use of the response scale. We correct by centring each person's responses on his or her own mean. This converts absolute value scores into scores that indicate the relative importance of each value to the person (their value priorities – see Schwartz, 1992, 2005a). The steps in centring are firstly to compute a mean score for each value based on the items that index it; secondly to compute the overall mean for all 21 items; and thirdly to subtract the overall mean from the score for each value. This yields ten centred value scores. These centred scores should be used in correlation analyses, in the various forms of analyses of variance, and in regression analyses.4 For other methods of scaling (multi-dimensional,

4 Centring is an 'ipsatising' procedure that causes linear dependence among the value scores. Entering all ten values as predictors in regression analyses produces coefficients that may be inaccurate due to multicollinearity. The R² for the total variance accounted for by values is accurate. To avoid multicollinearity, use fewer than ten values as predictors.
canonical, discriminant or confirmatory factor analyses), raw item scores are appropriate because those techniques handle scale use directly.
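The three centring steps can be sketched in a few lines. This is an illustrative sketch only: the item-to-value mapping below is hypothetical (the official ESS item order is given in Appendix 1), and numpy is assumed.

```python
import numpy as np

# Hypothetical item-to-value mapping for a 21-item scale.
# Indices are illustrative, NOT the official ESS item order (see Appendix 1).
VALUE_ITEMS = {
    "self_direction": [0, 10], "stimulation": [5, 14], "hedonism": [9, 20],
    "achievement": [3, 12], "power": [1, 16], "security": [4, 13],
    "conformity": [6, 15], "tradition": [8, 19], "benevolence": [11, 17],
    "universalism": [2, 7, 18],  # three items: especially broad content
}

def centred_value_scores(responses):
    """Return ten centred value scores for one respondent.

    `responses` is a length-21 sequence of similarity ratings.
    Step 1: mean of the items indexing each value.
    Step 2: overall mean of all 21 items.
    Step 3: subtract the overall mean from each value score.
    """
    r = np.asarray(responses, dtype=float)
    overall = r.mean()  # respondent's own scale-use level
    return {v: r[items].mean() - overall for v, items in VALUE_ITEMS.items()}
```

A respondent who gives every portrait the same rating ends up with all ten centred scores at zero, which is exactly the point: only the relative ordering of values carries information.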
Reliability of the ten values

The relatively low reliabilities, as shown in Table 9.2, may constitute less of a problem than it seems. The key issue is validity. Considerable evidence of validity is discussed below. Moreover, because the ten values form an integrated structure, one can combine items from motivationally adjacent values to form more reliable indexes of broader value orientations. The reliabilities of the four higher-order value orientations – openness to change,5 conservation, self-transcendence and self-enhancement – are acceptable for short scales. These broader orientations, derived from the structural circle of value motivations, can be used when finer distinctions are not needed. The indexes of the higher-order orientations include sufficient items to permit estimation of reliable latent variables.

Table 9.2 Cronbach alpha reliabilities, means and standard deviations of ten basic values and four higher-order values

Value                 Number of items   Cronbach alpha   Importance mean   Standard deviation
Self-direction               2               .48              .44            .15 (.78)
Stimulation                  2               .64             -.68            .15 (1.02)
Hedonism                     2               .67             -.20            .31 (.99)
Achievement                  2               .70             -.46            .25 (.95)
Power                        2               .44             -.91            .14 (.89)
Security                     2               .61              .41            .24 (.88)
Conformity                   2               .58             -.14            .27 (1.00)
Tradition                    2               .36             -.01            .20 (.97)
Benevolence                  2               .54              .66            .15 (.66)
Universalism                 3               .57              .59            .11 (.65)

Higher-Order Values
Openness to Change           6               .75             -.14            .16 (.64)
Conservation                 6               .73              .09            .19 (.69)
Self-transcendence           5               .69              .52            .10 (.45)
Self-enhancement             4               .72             -.68            .17 (.74)

Notes: For the reliability analyses, the N varies slightly around 35,000, due to missing data. The means and standard deviations are computed based on design weights within country and population weights across 20 countries.

5 Hedonism is included in the openness to change orientation because it relates more strongly to this higher-order value than to self-enhancement.
Round 1 of the ESS provided no data for estimating the test–retest reliability of the ten value indexes. Two studies with the 40-item Portrait Values Questionnaire, the source of many ESS items, are relevant (Schwartz, 2005b). In a German student sample, stability of the ten values over six weeks ranged from .62 to .82 (median .75). A French sample showed considerable stability even after two years (.50 to .66, median .61). Undoubtedly, real value change accounts for some of the instability. These data suggest that the portrait method of measuring values elicits quite consistent responses across occasions.

Value structures in the ESS countries

The geometric structure of relations among the value items in a multi-dimensional space also sheds light on the reliability and validity of the indexes. Items that share a similar motivational significance should be close to one another in such a space. Items with opposing motivational significance should be distant from one another. Figure 9.2 presents a two-dimensional representation of the relations among the items for all 35,161 respondents who completed the value survey in ESS Round 1. This representation is based on Smallest Space Analysis (Guttman, 1968; Borg and Shye, 1995). Other multi-dimensional scaling (MDS) techniques yield similar representations. Items are numbered as in the list in Appendix 1 at the end of this chapter. In the data-based Figure 9.2, the items intended to measure each value are indeed close to one another and distant from those that express competing motivations. By adding partition lines and labels in the space, I highlight the fact that the items form ten distinct regions, one for each value. The observed pattern justifies forming indexes of each value that combine the items originally intended to measure it. This supports both the reliability and the construct validity of the indexes.
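Maps like Figure 9.2 can be roughly approximated with classical (Torgerson) metric MDS applied to item dissimilarities (for example, 1 − r for each pair of items). The function below is a generic sketch of that simpler relative of SSA, not the Guttman algorithm used for the figure; it assumes numpy.

```python
import numpy as np

def classical_mds(dissim, dim=2):
    """Classical (Torgerson) metric MDS.

    Embeds an (n, n) symmetric dissimilarity matrix in `dim` dimensions:
    double-centre the squared dissimilarities, eigendecompose, and keep
    the coordinates for the `dim` largest eigenvalues.
    """
    D2 = np.asarray(dissim, dtype=float) ** 2
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    B = -0.5 * J @ D2 @ J                     # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)            # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:dim]      # take the `dim` largest
    L = np.sqrt(np.clip(vals[order], 0, None))
    return vecs[:, order] * L                 # (n, dim) coordinates
```

Feeding it a 21 × 21 matrix of item dissimilarities would yield a two-dimensional item map whose proximities can then be inspected for the ten value regions, much as in Figure 9.2.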
A comparison of the data-based Figure 9.2 with the theory-based Figure 9.1 shows that the pattern obtained across ESS countries replicates the circular motivational structure of relations among the ten values.6 This means that, across Europe, people who give high priority to universalism values tend to emphasise benevolence values too; but they give low priority to power and achievement values. In turn, those who emphasise power and achievement values tend to de-emphasise universalism and benevolence. Those who give high priority to self-direction values also tend to emphasise stimulation and

6 There is one minor variation. Tradition values emerge between benevolence and conformity values rather than behind the latter. Tradition still forms part of the higher-order conservation value with conformity and security, to which it is closest.
Figure 9.2 Smallest Space Analysis (SSA) of 21 value items across 20 ESS countries in Round 1 (N = 35,161), coefficient of alienation .11
hedonism values, but give low priority to tradition, conformity and security values. And those who emphasise the latter values de-emphasise the former. In sum, two dimensions organise Europeans' value priorities: self-transcendence vs. self-enhancement, and openness to change vs. conservation. This is the same pattern of compatible and conflicting values found around the world – in Africa, Asia, Oceania, and North and South America (Schwartz, 2005a).

Figure 9.2 gives the pattern averaged across the ESS countries. Does the pattern in each country conform to the typical world-wide picture? Separate analyses in each of the 20 countries yielded structures very similar to Figure 9.2. In 15 countries, the ten values formed ten distinct regions. In the remaining five countries, eight values formed distinct regions and the items of two values were intermixed. In each case, these were two values adjacent in the motivational circle. This implies no real deviation from the motivational circle. The observed structure in all 20 countries allowed the combination of values into the two pairs of opposing higher-order orientations. Table 9.3 provides fuller details of the structure in each country. In sum, the structure of relations among the ten values is virtually the same across Europe.
Table 9.3 Summary of structural analyses of value items in 20 countries (see note below for key to values)

Country           # distinct   Deviations from Europe-wide          Items misplaced
                  regions      structure                            with wrong value
Austria               8        Hed/Sti combined                     None
Belgium              10        None                                 None
Czech Republic        8        Ben/Univ combined                    None
Denmark              10        None                                 None
Finland              10        None                                 None
France                8        Conf/Sec combined                    None
Germany              10        None                                 None
Greece               10        Sec, Trad, Conf reversed             None
Hungary              10        Hed, Ach reversed;                   #8 with security values
                               Sec, Conf, Trad reversed
Ireland               8        Hed/Sti combined                     None
Israel               10        None                                 None
Netherlands          10        None                                 None
Norway               10        None                                 None
Poland               10        None                                 #14 with universalism values
Portugal             10        Sti/Hed reversed;                    None
                               Sec, Trad, Conf reversed
Slovenia             10        Sec/Conf reversed                    None
Spain                10        None                                 None
Sweden                8        Sec/Conf combined                    None
Switzerland          10        None                                 None
United Kingdom       10        None                                 None

Notes: All sets of combined and reversed values are adjacent in the circular motivational structure. Key: Hed = Hedonism, Sti = Stimulation, Ben = Benevolence, Univ = Universalism, Conf = Conformity, Sec = Security, Trad = Tradition, Ach = Achievement
Value priorities in the ESS countries

The structural analyses establish the near-equivalence of meaning of the ten values across countries. We can therefore compare the value priorities of individuals in different countries. Figure 9.3 presents the relative importance of the ten values across countries (see also columns 3 and 4 of Table 9.2). Benevolence, universalism, self-direction, and security values are most important and power values least important. This corresponds with earlier findings from around the world (Schwartz and Bardi, 2001).7 Schwartz and Bardi (2001) explicate how the relative importance of the ten values may reflect their differential adaptive functions in meeting the requirements of successful societal functioning.

7 Comparisons with data obtained with the 56- and 57-item abstract value survey and the 40-item portrait value survey suggest that the brief ESS value scale may underestimate the relative importance of achievement values and overestimate the relative importance of tradition values. This does not affect cross-national comparisons or associations of values with other variables.
Figure 9.3 Mean importance of values across 20 countries
Notes: Ben = Benevolence, Univ = Universalism, SDir = Self-direction, Sec = Security, Trad = Tradition, Conf = Conformity, Hed = Hedonism, Ach = Achievement, Sti = Stimulation, Pow = Power
There is substantial cross-national variation in average value priorities. (Appendix 2 at the end of this chapter presents the value profiles of the 20 countries.) Figure 9.4 summarises the country similarities and differences graphically, using the co-plot extension of MDS (Goldreich and Raveh, 1993). The figure maps the countries based on the priority their populations give to the four higher-order value orientations. Using only the four orientations simplifies the presentation and ensures that the measures have equivalent meaning across countries. On this map, the closer together any two countries are, the more similar are the priorities of their populations across the four orientations.8 For example, on the left, Switzerland and Denmark are quite similar on all four orientations, as are Hungary and Slovenia on the right.
8 Distances on the map represent the sum of the squared differences between pairs of nations across the four orientations. The co-plot adds arrows to the MDS that identify a vector for each orientation. The vectors indicate the direction in the space that best captures the order of countries from highest to lowest on the orientation. By dropping a perpendicular from each country's location to a vector (extensions of vectors through the centre are not shown), the order of countries on each orientation can be read off. The map locates the relative positions of the countries and their positions on each vector quite accurately. The stress value for the MDS is .07, and the locations of countries on the vectors correlate with the countries' actual scores on each orientation from .95 to .99.
Figure 9.4 Co-plot mapping of 20 countries on four higher-order value orientations
The arrows pointing to the four orientations indicate the direction of increasing priority for each orientation. Thus, the farther toward the upper left a country is, the more importance its population attributes to openness (change) and the less to conservation (status quo) values. The Danes give comparatively high priority to openness and low priority to conservation values. Meanwhile, the Poles give comparatively high priority to conservation and low priority to openness values. All populations attribute higher importance to self-transcendence (other-enhancing) than to self-enhancement (self-promoting) values. However, the Israelis give comparatively low priority to self-transcendence and high priority to self-enhancement, whereas the French exhibit the opposite pattern. The map reveals that countries with common cultural and historical backgrounds share some, though not all, value priorities. The former communist countries form a band on the right of the map. Their populations give comparatively low priority to openness values and high priority to conservation values. With the exception of the Czechs, they also give high priority to self-enhancement and low priority to self-transcendence values. This pattern may reflect the experience of totalitarian rule, Roman Catholic and pre-communist agrarian traditions, and relatively low income. These experiences limited opportunities for individual freedom of expression and life-style. People may also have concentrated attention on their own needs to survive and get ahead to the exclusion of others.
The Greeks and, to a lesser extent, the Spanish and Portuguese populations exhibit value priorities similar to the East and Central Europeans. These Southern European countries also share the experience of a strongly hierarchical religion, a period of totalitarian rule until the mid-1970s and, until recently, lower incomes than other West European countries. Other shared historical experiences may also account for the closeness of Portugal and Spain. Adjacent, in the centre of the map, are the two English-speaking countries with strong historical and linguistic ties, the UK and Ireland. Compared with the other West European countries, their populations give high priority to self-enhancement values and low priority to self-transcendence values. These priorities may reflect their liberal welfare regimes that give freer rein to market forces and require individuals to fend more for themselves (Esping-Andersen, 1990). Within the remaining West European countries, no simple pattern emerges. All are at or above the mean on self-transcendence and, excepting Austria, below the mean on self-enhancement. All but Finland are below the mean on conservation, and all but Norway are above the mean on openness. These priorities are fitting for populations in welfare states, but there is no apparent distinction between corporatist and social democratic regimes. If the Swiss are split into French- and German-speaking sub-samples (not shown), the former are located closer to France and the latter closer to Germany on the map. Similarly, the sub-sample of French-speaking Belgians is closer to France and the sub-sample of Dutch-speakers closer to the Netherlands. These affinities suggest the importance of cultural factors as determinants of values. Deeper analyses are needed to understand the exceptions to the general pattern and the other variations within this region.
The Israeli pattern of relatively high self-enhancement and low self-transcendence characterises both the Jewish and, even more, the Arab populations. This focus on advancing self-interests may reflect a response to the threat of active hostility from neighbours. World-wide comparisons of cultural value orientations (Schwartz, 2004) indicate that this type of pattern is common in countries in the early or middle phases of moving from weak economies toward capitalist development (e.g. China, Uganda). Here, the pattern also characterises the less affluent European countries (except the Czech Republic). Across the 20 ESS countries, national affluence, measured by GDP per capita in 1999, correlated positively with self-transcendence (r = .65) and openness (r = .74) values, and negatively with conservation (r = -.75) values, but not significantly with self-enhancement (r = -.32) values. The value profiles of the country populations in 2002–3 are baselines against which to assess social change in Europe over the coming years. These basic value priorities underlie a wide variety of more specific social values, attitudes and behaviour. Any changes that occur over time in national scores on the basic
values are likely to reflect profound social change in societies. For example, trends away from social democratic welfare regimes toward more liberal regimes may produce a drop in openness and self-transcendence values and an increase in conservation and self-enhancement values. Increasing ethnic and religious conflict may have similar effects. The basic values of populations also facilitate or inhibit social change. Policies with implications for pursuing self-direction or conformity values, for example, may meet quite different receptions in Switzerland and Spain because of differences in the basic value priorities of their populations. The average value priorities of national populations are meaningful for understanding national differences. Nonetheless, within every country some people give high priority to values their fellow nationals consider unimportant, and others give little or no priority to the values their fellows consider most important. Indeed, there is substantially greater variation within than between countries in value priorities, and in attitudes and behaviour too. So I turn next to a brief examination of some sources of individual differences in basic values.

Sources of individual differences in basic values
Age and the life course

As people grow older, they tend to become more embedded in social networks, more committed to habitual patterns, and less exposed to arousing and exciting changes and challenges (Glen, 1974; Tyler and Schuler, 1991). This implies that conservation values (tradition, conformity, security) should increase with age, while openness to change values (self-direction, stimulation, hedonism) should decrease. Once people begin starting families of their own and attain stable positions in the occupational world, they tend to become less preoccupied with their own strivings and more concerned with the welfare of others (Veroff et al., 1984). This implies that self-transcendence values (benevolence, universalism) increase with age as self-enhancement values (power, achievement) decrease. The first column of the summary Table 9.4 reports correlations of age with values across the 20 ESS countries included in this analysis. The number of countries in which the correlation was in the same direction as the overall correlation appears in parentheses. All the observed correlations confirm the expected associations and support the probable processes of influence. All associations are monotonic and all but two are linear. Universalism rises sharply up to ages 40–50 but barely increases thereafter. Self-direction drops significantly only after age 60.
Value orientations: measurement, antecedents and consequences
Gender

Various theories of gender difference lead researchers to postulate that men emphasise ‘agentic-instrumental’ values such as power and achievement, while women emphasise ‘expressive-communal’ values such as benevolence and universalism (Schwartz and Rubel, 2005). Most theorists expect these differences to be small. Column 2 of summary Table 9.4 supports expectations regarding both the nature and the strength of value relations to gender.

Table 9.4 Correlations of the ten values with age, gender, education and income in 20 countries

Value             Age            Gender (Female)   Education      Income
                  (N = 35,030)   (N = 35,165)      (N = 34,760)   (N = 28,275)
Security           .26 (20)       .11 (20)         −.20 (20)      −.12 (20)
Conformity         .32 (20)       .02 (13)         −.22 (20)      −.14 (20)
Tradition          .33 (20)       .08 (20)         −.22 (20)      −.16 (20)
Benevolence        .13 (20)       .18 (20)         −.04 (11)      −.05 (15)
Universalism       .15 (19)       .12 (20)          .06 (16)      −.01 (14)
Self-direction    −.08 (15)      −.06 (19)          .19 (20)       .10 (18)
Stimulation       −.37 (20)      −.09 (20)          .16 (19)       .11 (18)
Hedonism          −.33 (20)      −.06 (18)          .08 (15)       .08 (19)
Achievement       −.26 (20)      −.12 (20)          .14 (20)       .12 (19)
Power             −.09 (18)      −.14 (19)          .02 (13)       .08 (19)

Notes: Correlation is not significantly different from zero. Figures in parentheses are the number of countries with correlations in the indicated direction
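Table 9.4 lists the values in their order around the circular structure of value relations, which implies that each column of correlations should rise and fall smoothly around the circle, decreasing from the most positively correlated value to the least positively correlated one in both directions. A quick mechanical check of that pattern on the age column can be sketched in plain Python (the correlations are those in the table; the check itself is my own illustration, not a procedure from the chapter):

```python
# Values in circle order (as in Table 9.4), with their age correlations.
AGE_R = {
    "security": .26, "conformity": .32, "tradition": .33,
    "benevolence": .13, "universalism": .15, "self-direction": -.08,
    "stimulation": -.37, "hedonism": -.33, "achievement": -.26,
    "power": -.09,
}

def circumplex_violations(corrs):
    """Count the increases encountered while walking around the circle
    from the most positively correlated value to the least positively
    correlated one, in each direction. A perfect circumplex pattern
    yields zero violations."""
    n = len(corrs)
    peak = corrs.index(max(corrs))
    trough = corrs.index(min(corrs))
    violations = 0
    for step in (1, -1):             # clockwise, then anticlockwise
        i = peak
        while i != trough:
            j = (i + step) % n
            if corrs[j] > corrs[i]:  # correlation rose: a violation
                violations += 1
            i = j
    return violations

age = list(AGE_R.values())
print(circumplex_violations(age))    # -> 1 (the single rise from .13 to .15)
```

Here the only deviation is the small rise from benevolence (.13) to universalism (.15), consistent with the observation that the correlations "generally" exhibit the circular pattern.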
Education

Educational experiences presumably promote the intellectual openness, flexibility, and breadth of perspective essential for self-direction values (Kohn and Schooler, 1983). These same experiences increase openness to non-routine ideas and activity central to stimulation values. In contrast, these experiences challenge unquestioning acceptance of prevailing norms, expectations and traditions, thereby undermining conformity and tradition values. The increasing competencies to cope with life that people acquire through education may also reduce the importance of security values.

Column 3 of summary Table 9.4 reveals the expected positive correlations of years of formal education with self-direction and stimulation values and negative correlations with conformity, tradition and security values. In addition, education is associated positively with achievement values. The constant grading and comparing of performance in schools, emphasising
external standards, could account for this. The associations of education with values are largely linear, with the exception of universalism values. Universalism values begin to rise only in the last years of secondary school. They are substantially higher among those who attend university. This may reflect both the broadening of horizons that university education provides and a tendency for those who give high priority to universalism values to seek higher education.
Income

Affluence creates opportunities to engage in discretionary activities and to choose one’s life-style freely. It reduces security threats and the need to restrict one’s impulses and to maintain supportive, traditional ties. Higher income should therefore promote the prioritisation of stimulation, self-direction, hedonism and achievement values over security, conformity and tradition values. The correlations between total household income (12 categories) and value priorities, in column 4 of summary Table 9.4, support these expectations. Income did indeed contribute to higher stimulation, self-direction, achievement and power values, primarily in the upper third of the income distribution.

Table 9.4 also lists the values in an order corresponding to their position around the circular structure of value relations (cf. Figure 9.1). The patterns of correlation in Table 9.4 illustrate two features of the relations of values to other variables derived from the circular motivational structure. First, any outside variable tends to have similar associations with values that are adjacent in the value circle. Second, associations with any outside variable decrease monotonically around the circle in both directions, from the most positively associated to the least positively associated value. The correlations in Table 9.4 generally exhibit both these features. This is evidence that the ESS value scale measures value priorities in a way that reflects their underlying motivational continuum.

Basic values as a predictor of national and individual variation in attitudes and behaviour
Attitudes to immigration

Attitudes toward immigration are a major concern in Europe today. Three ESS items measured opposition to accepting ‘other’ immigrants – those of a different race/ethnic group and from poorer European and non-European
countries. A summary index of these items revealed great variation in levels of opposition across countries.9 Still, country differences accounted for only 12 per cent of the variation in opposition, while differences between individuals within countries accounted for 88 per cent. This illustrates a general fact: even when there is significant variation across countries on an important attitude or behaviour, most of the variation is at the individual level. Hence, to understand the sources of behaviours and attitudes, it is important to study both the individual and the country levels simultaneously.

For this purpose, I employ hierarchical linear modelling (HLM5: Bryk and Raudenbush, 2002) at two levels (individual and country). I jointly examine individual and country level sources of attitudes toward immigrants, of components of social capital, and of political activism. The HLM analyses compute regressions both within and across countries. Within country, they estimate the effects on attitudes and behaviours of individual differences like age, education, and personal values.10 Controlling these individual-level effects, the analyses estimate the effects of national characteristics like GDP per capita and average values on country-level differences in attitudes and behaviours.

All analyses reported here gave a wide range of individual characteristics a chance to predict variance in attitude or behaviour, including age, gender, years of education, household income, marital status, religiosity, foreign birth, ever having been unemployed for three or more months, living in a large city, and other individual value priorities expected to be relevant in particular cases.
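The individual-versus-country split reported here (for example, 12 per cent of the variation between countries and 88 per cent within them) is a variance decomposition of the kind a two-level random-intercept model estimates. A minimal sketch on simulated data follows; the country and respondent numbers and the spread parameters are invented for illustration, and this is an analogue of the decomposition rather than the authors' actual HLM5 analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an attitude score for 20 hypothetical countries, 500
# respondents each: every country has its own mean, and individuals
# vary around that mean.
n_countries, n_per = 20, 500
country_effect = rng.normal(0.0, 0.4, n_countries)   # between-country spread
country = np.repeat(np.arange(n_countries), n_per)
score = country_effect[country] + rng.normal(0.0, 1.0, country.size)

def variance_shares(y, group):
    """Share of total variance lying between groups vs. within them,
    a simple analogue of the intraclass correlation that a
    hierarchical linear model would report."""
    total = y.var()
    labels = np.unique(group)
    means = np.array([y[group == g].mean() for g in labels])
    sizes = np.array([(group == g).sum() for g in labels])
    between = np.sum(sizes * (means - y.mean()) ** 2) / y.size
    return between / total, 1.0 - between / total

between_share, within_share = variance_shares(score, country)
```

With the spread chosen above, only a modest share of the variance lies between countries; most variation is between individuals within countries, mirroring the pattern the chapter reports for each attitude and behaviour.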
In addition, all analyses gave a wide range of national characteristics the chance to predict variance by country, including GDP per capita, Human Development Index (HDI), life expectancy at birth, proportion of the population below age 15, proportion above age 64, per cent living in big cities (all in 1999), per cent annual GDP growth 1990–9, average annual inflation rate 1990–9, scores on two Inglehart cultural dimensions (tradition vs. secular-rational, survival vs. self-expression), the score on Hofstede’s individualism–collectivism cultural dimension, and the Schwartz cultural value orientations expected to be relevant.11 Appendix 3 at the end of this chapter summarises the statistical findings.

9 Only the 15 West European countries are in this analysis, because immigration to East Europe and to Israel has different characteristics and meanings.
10 HLM provides more accurate estimates of effects because, when estimating regression coefficients, it controls the differential reliability of within-nation relationships due to differences in sample size.
11 The Inglehart cultural dimension scores are those in Inglehart and Baker (2000); the Hofstede scores are from Hofstede (2001, pp.500–501); the Schwartz cultural value orientation scores were computed based on indexes from culture-level analyses of the ESS data (scores available from the author); and the other variables come from the ESS macro-data file.
Individuals who gave higher priority to universalism values and lower priority to security and conformity values, who were more educated and older, and who were more likely to be foreign-born, were more willing to accept ‘others’ as immigrants into their country. Universalism values were the strongest predictor, followed by education and security values. Thus, the trade-off between giving high priority to promoting the welfare of all others (universalism values) and avoiding personal, national and interpersonal threat (security and conformity values) influences readiness to accept ‘other’ immigrants, over and above an individual’s socio-demographic characteristics.

Eight national characteristics correlated with country differences in acceptance of immigrants. However, only three predicted acceptance in the country-level regression. The Schwartz cultural value orientation of egalitarianism predicted most strongly, followed by the Human Development Index and then by the per cent living in big cities. Together, these three explained 62 per cent of the variation between countries. None of the other economic, social, demographic or cultural characteristics of nations added significant predictive power.

Societal egalitarianism values refer to a cultural emphasis on preserving the social fabric by voluntarily transcending selfish interests and promoting others’ welfare (Schwartz, 2004). They are measured by the average societal importance of such value items as equality, social justice, responsibility, help and honesty. Egalitarianism values are the culture-level parallel to individual-level universalism values.12 Thus, individuals’ universalism values and the parallel country-level egalitarianism values were the most significant predictors of accepting ‘other’ (potentially disruptive?) immigrants into one’s country.
Interpersonal trust

Values are also important for predicting components of social capital, of “connections among individuals – social networks and the norms of reciprocity and trustworthiness that arise from them” (Putnam, 2000, p.19). First, consider interpersonal trust. Two ESS items measured this component. One asked whether most people can be trusted, the other whether most people generally try to take advantage of you. Country differences accounted for 19 per cent of the variation in a summary index of these trust items, individual differences for 81 per cent. Greater education predicted individual differences in trust most strongly, followed by security values (negative) and universalism values (positive).

12 For a discussion of the difference between value dimensions that discriminate at the individual and culture levels and of the importance of making this distinction, see Smith and Schwartz (1997).
Other predictors, in order of importance, were never having been unemployed, higher income, benevolence values and religiosity (all positive).

Seven national characteristics correlated substantially with country differences in interpersonal trust. Two of the Schwartz cultural value orientations predicted most strongly in the country-level regression: embeddedness predicted negatively, followed by egalitarianism positively. Of the other national characteristics, only average life expectancy added significant predictive power.

Embeddedness values refer to a cultural emphasis on maintaining the status quo and restraining actions that might disrupt in-group solidarity or the traditional order (Schwartz, 2004). They are the culture-level parallel to individual-level conservation values. They are measured by the average societal importance of such value items as social order, respect for tradition, security, obedience and wisdom. The more important embeddedness values are in a country, the lower its level of interpersonal trust. An emphasis on embeddedness values expresses the cultural assumption that strong social controls should be the vehicle for preserving personal and social ties and preventing anomie. In contrast, the assumption underlying a cultural emphasis on egalitarianism values is that people can and should be socialised to transcend their selfish interests voluntarily and promote the welfare of others. Thus, interpersonal trust is greater to the extent that the cultural value orientations in a country emphasise the expectation that individuals will voluntarily promote the welfare of others rather than the expectation that social controls are necessary to prevent the breakdown of interpersonal ties. The prevailing level of trust in a society probably also feeds back to increase or decrease these two cultural value orientations.
Together with life expectancy, embeddedness and egalitarianism value orientations explained 43 per cent of the variation in interpersonal trust between countries. Thus, individuals’ security and universalism values and the parallel country-level embeddedness and egalitarianism values were among the most important predictors of interpersonal trust.
Social involvement

A combination of two items reflected a behavioural component of social capital – engaging in social activities and meeting socially with friends, relatives and colleagues. Country differences accounted for 8 per cent of the variation, individual differences for 92 per cent. Being younger and giving priority to hedonism values were the strongest predictors of social involvement. Stimulation values, benevolence values and greater education followed in order. Interestingly, hedonism and benevolence, values that are conceptually unrelated and important to different people, both predicted social involvement. This reflects the fact that different types of social activity can produce
high social involvement. The predictive importance of hedonism and stimulation values implies that activity that satisfies motivations for pleasure and excitement is most prominent, even when age and education are held constant. The importance of benevolence values suggests that interest in others close to you is the other main motivation for social involvement.

Nine country characteristics correlated with differences between countries in social involvement. But only the human development index (HDI) and cultural egalitarianism values entered the country-level regression at the .01 level of significance. Each of these country characteristics, taken separately, accounted for the same amount of variance between countries in social involvement – 49 per cent. No other variable added significantly to the prediction of either one of these. It is probably reasonable to consider both as predictors. Together they accounted for 56 per cent of the between-country variance.

The higher the HDI and the greater the emphasis on egalitarianism values in the society, the greater is the social involvement. Higher levels of human social and economic development may make resources of time and money available that enable people to engage more in social activities. A cultural emphasis on egalitarianism values encourages people to view all sorts of others – including those beyond the in-group – as moral equals (Schwartz, 2004) with whom one may legitimately engage in social activities. This cultural emphasis also legitimises cultivation of personal interests, thereby encouraging social activity based on shared interests and not only shared kinship.
Organisational membership

A third common aspect of social capital is membership in voluntary organisations. The number of memberships in 12 types of organisation (sports, humanitarian, labour, religious, etc. – see chapter 10) indexed this variable. Country differences accounted for 21 per cent of the variation in membership, individual differences for 79 per cent. Most important was education, followed by income, age, religiosity, and being male. Only four values – universalism, benevolence, stimulation and self-direction – predicted overall membership, all weakly. This is because values may motivate joining some types of organisation in a positive direction and others in a negative direction. Examination of subtypes of organisation revealed that the two self-transcendence values predicted joining humanitarian and environmental organisations especially strongly, and the two ‘openness to change’ values predicted joining cultural, sports and hobby groups especially strongly. Conservation values inhibited joining both subtypes of organisation, creating motivational trade-offs with self-transcendence and openness values.
Membership in voluntary organisations was greater in countries lower in cultural embeddedness values and higher in the HDI. Together, these variables explained 66 per cent of the variance in membership. The influence of HDI probably results from the fact that people are less preoccupied with making a basic living in countries where there are greater resources of time and money, so more people find it feasible to join voluntary organisations. Cultural embeddedness values may discourage unnecessary involvement with people outside the broad in-group; instead, they emphasise loyalty and devotion to the in-group, which would not encourage membership in voluntary groups in the wider society.

Many different individual characteristics predicted one or another of the social capital indexes, though only education and benevolence values predicted all three. At least three basic values explained significant variance in each social capital index. Overall, however, the individual characteristics examined here explained only a small proportion of the individual-level variance. Differences in the opportunities and constraints to which individuals are exposed doubtless account for much additional variance.
Political activism

A final illustration of the effects of basic values on behaviour concerns political activism. This was measured as the number of politically relevant, legal acts (out of nine) that respondents reported performing in the past 12 months (e.g. contacting a politician, participating in a public demonstration, boycotting a product – see chapter 10 again). As an individual-level predictor, I included subjective political efficacy (the mean of two standardised items – individuals’ beliefs that they are able to take an active role in a political group, and their reported ability to make up their minds on political issues). This permitted a test of whether values interact with self-reported efficacy in their effect on activism. Such interactions would support the view that values motivate behaviour.

Country differences accounted for 10 per cent of the variation in political activism, individual differences for 90 per cent. Understandably, political efficacy was the strongest individual-level predictor, perhaps in part because respondents infer their efficacy from their activism. Next was education, followed by four values. The strongest was universalism values, which promote social justice and environmental preservation – the goals of much recent activism. The positive influence of self-direction and stimulation values and the negative influence of conformity values reflect the fact that activism is uncommon, exciting and risky. But the interactions of subjective efficacy with universalism, self-direction and stimulation values also predicted activism. As hypothesised, the more efficacious the individual felt, the greater was the impact of these values.
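In regression terms, an efficacy-by-values interaction of this kind is a product term added to the predictor set. A sketch on simulated data follows; the coefficients and variable names are invented for illustration and are not estimates from the ESS:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical standardised predictors.
efficacy = rng.normal(size=n)
universalism = rng.normal(size=n)

# Simulated activism score: universalism matters more for people who
# feel politically efficacious (a positive product-term coefficient).
activism = (0.5 * efficacy + 0.2 * universalism
            + 0.15 * efficacy * universalism
            + rng.normal(size=n))

# OLS with intercept, both main effects, and the interaction term.
X = np.column_stack([np.ones(n), efficacy, universalism,
                     efficacy * universalism])
beta, *_ = np.linalg.lstsq(X, activism, rcond=None)
# beta[3] recovers the interaction: the slope of activism on
# universalism grows with self-reported efficacy.
```

A reliably positive estimate for the product term corresponds to the finding reported above: the more efficacious the individual felt, the stronger the effect of the relevant values on activism.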
Although 11 national characteristics correlated with country differences in political activism, only two entered the country-level regression. The HDI and the cultural value orientation of intellectual autonomy explained 75 per cent of the variation between countries. The intellectual autonomy cultural value orientation assumes that individuals are unique entities who should be encouraged to cultivate and pursue their own ideas and intellectual directions independently (Schwartz, 2004). It is measured by the average societal importance of such value items as broadmindedness, curiosity and creativity. Not surprisingly, social and economic development, often accompanied by increased democratisation, fosters political activism. However, a cultural value atmosphere that encourages autonomous thought and action makes an additional contribution.

Conclusion

This chapter has presented a theory of ten basic human values that has guided development of the ESS human values scale. These values form a coherent system that captures the conflicts among the basic motivations that guide people’s attitudes and behaviour. But a prerequisite for comparing national value profiles is evidence that the values have equivalent meanings across countries. Structural analyses of relations among the value items demonstrated such equivalence for the ten values, so researchers are urged to use the ten values, or their combinations into higher-order orientations, when studying relations of values to other variables. This permits drawing on the theory to generate hypotheses and explanations, using the trade-offs among motivationally opposed values that influence attitudes, opinions and behaviour, rather than offering ad hoc analyses based on value items.

Methodological explanations and instructions to facilitate the application of the value scale have been given. Most crucial is the correction for differences in use of the response scale, which transforms absolute scores into value priorities.
This correction is crucial for obtaining accurate value comparisons among individuals or groups and accurate associations with other variables. Analyses using this correction have revealed systematic relations of age, gender, education and income to people’s value priorities. They have also mapped and compared the average individual value priorities in countries, which may link to national and regional political and economic history, language, religion and welfare systems. Developing in-depth explanations of national value priorities, and of changes in them that emerge in later ESS rounds, remains a challenge for future analysis.
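The scale-use correction described here (centring each respondent's ratings on his or her own mean across all the value items, so that scores express relative priorities rather than absolute endorsement) can be sketched as follows. The ratings below are made up, and this is a minimal illustration rather than the official ESS processing syntax:

```python
import numpy as np

# Hypothetical raw ratings: 3 respondents x 6 items (the real scale
# has 21 items), on a 1-6 "like me" response scale.
raw = np.array([
    [1, 2, 1, 2, 1, 2],   # rates every item near one end of the scale
    [5, 6, 5, 6, 5, 6],   # same profile, shifted to the other end
    [1, 6, 1, 6, 1, 6],   # differentiates strongly between items
], dtype=float)

def centre_within_person(ratings):
    """Subtract each respondent's own mean rating, converting
    absolute scores into relative value priorities and removing
    individual differences in response-scale use."""
    return ratings - ratings.mean(axis=1, keepdims=True)

priorities = centre_within_person(raw)
# After centring, respondents 1 and 2, who agree on the ordering of
# the items but use the response scale differently, receive
# identical priority scores.
```

The same idea applies when the items are first averaged into the ten value scores: it is the relative ordering within a person, not the absolute scale position, that carries the information about value priorities.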
A key message of this chapter is the important role that value priorities can play in explaining socially significant attitudes and behaviour at both the individual and the country level. We examined attitudes toward accepting immigrants, interpersonal trust, social involvement, organisational membership and political activism. Individuals’ relevant value priorities predicted each of these within countries. Cultural value orientations, also derived from the ESS human values scale, were impressive predictors of national differences in each of these attitudes or behaviours, after controlling for individual-level effects. Indeed, cultural values predicted as consistently and strongly as the index of social and economic development (HDI). Cultural values out-performed ten other country characteristics that are frequently used to explain national differences. Values, as measured by the human values scale, are thus a powerful tool for research with the ESS.

References

Allport, G.W. (1961), Pattern and Growth in Personality, New York: Holt, Rinehart and Winston.
Allport, G.W., Vernon, P.E. and Lindzey, G.A. (1960), A Study of Values, Boston: Houghton Mifflin.
Bales, R.F. and Couch, A. (1969), ‘The value profile: A factor analytic study of value statements’, Sociological Inquiry, 39, pp.3–17.
Borg, I. and Shye, S. (1995), Facet Theory: Form and Content, London: Sage.
Braithwaite, V.A. and Law, H.G. (1985), ‘Structure of human values: Testing the adequacy of the Rokeach Value Survey’, Journal of Personality and Social Psychology, 49, pp.250–263.
Braithwaite, V.A. and Scott, W.A. (1991), ‘Values’ in: J.P. Robinson, P. Shaver and L. Wrightsman (eds), Measures of Personality and Social Psychological Attitudes, New York: Academic Press, pp.661–753.
Bryk, A.S. and Raudenbush, S.W. (2002), Hierarchical Linear Models: Applications and Data Analysis Methods, 2nd edition, Newbury Park, CA: Sage.
Chinese Culture Connection (1987), ‘Chinese values and the search for culture-free dimensions of culture’, Journal of Cross-Cultural Psychology, 18, pp.143–164.
Clarke, H.D., Kornberg, A., McIntyre, C., Bauer-Kaase, P. and Kaase, M. (1999), ‘The effect of economic priorities on the measurement of value change: New experimental evidence’, American Political Science Review, 93, pp.637–647.
Dempsey, P. and Dukes, W.F. (1966), ‘Judging complex value stimuli: An examination and revision of Morris’s “Paths of Life”’, Educational and Psychological Measurement, 26, pp.871–882.
Esping-Andersen, G. (1990), The Three Worlds of Welfare Capitalism, Princeton, NJ: Princeton University Press.
Feather, N.T. (1985), Values in Education and Society, New York: Free Press.
Fitzsimmons, G.W., Macnab, D. and Casserly, C. (1985), Technical Manual of the Life Roles Inventory Values Scale and the Salience Inventory, Edmonton, Canada: PsiCan Consulting.
Fontaine, J. and Schwartz, S.H. (1996), ‘Universality and Bias in the Structure of Psychological Questionnaire Data’, Paper presented at the XIII Congress of the International Association for Cross-Cultural Psychology, Montreal, Canada, August 1996.
Glen, N.D. (1974), ‘Aging and conservatism’, Annals of the American Academy of Political and Social Science, 415, pp.176–186.
Goldreich, Y. and Raveh, A. (1993), ‘Coplot display technique as an aid to climatic classification’, Geographical Analysis, 25, pp.337–353.
Gordon, L.V. (1967), Survey of Interpersonal Values, Chicago: Science Research Associates.
Guttman, L. (1968), ‘A general nonmetric technique for finding the smallest coordinate space for a configuration of points’, Psychometrika, 33, pp.469–506.
Halman, L. and de Moor, R. (1994), ‘Value shift in Western societies’ in: P. Ester, L. Halman and R. de Moor (eds), The Individualizing Society: Value Change in Europe and North America, Tilburg: Tilburg University Press.
Hofstede, G. (1980), Culture’s Consequences: International Differences in Work-Related Values, Beverly Hills, CA: Sage.
Hofstede, G. (2001), Culture’s Consequences: Comparing Values, Behaviours, Institutions, and Organizations across Nations, London: Sage.
Holyoak, K.J. and Gordon, P.C. (1983), ‘Social reference points’, Journal of Personality and Social Psychology, 44, pp.881–887.
Inglehart, R. (1997), Modernization and Postmodernization: Cultural, Economic and Political Change in 43 Countries, Princeton, NJ: Princeton University Press.
Inglehart, R. and Baker, W.E. (2000), ‘Modernization, cultural change, and the persistence of traditional values’, American Sociological Review, 65, pp.19–51.
Kluckhohn, C.K.M. (1951), ‘Values and value orientations in the theory of action’ in: T. Parsons and E.
Shils (eds), Toward a General Theory of Action, Cambridge, MA: Harvard University Press.
Kohn, M. (1969), Class and Conformity: A Study in Values, Homewood, IL: Dorsey Press.
Kohn, M.L. and Schooler, C. (1983), Work and Personality: An Inquiry into the Impact of Social Stratification, Norwood, NJ: Ablex.
Lorr, M., Suziedelis, A. and Tonesk, X. (1973), ‘The structure of values: Conceptions of the desirable’, Journal of Research in Personality, 7, pp.137–147.
Morris, C.W. (1956), Varieties of Human Value, Chicago: University of Chicago Press.
Putnam, R.D. (2000), Bowling Alone: The Collapse and Revival of American Community, New York: Simon and Schuster.
Rokeach, M. (1973), The Nature of Human Values, New York: Free Press.
Schwartz, S.H. (1992), ‘Universals in the content and structure of values: Theory and empirical tests in 20 countries’ in: M.P. Zanna (ed.), Advances in Experimental Social Psychology, Vol. 25, New York: Academic Press.
Schwartz, S.H. (1996), ‘Value priorities and behaviour: Applying a theory of integrated value systems’ in: C. Seligman, J.M. Olson and M.P. Zanna (eds), The Psychology of Values: The Ontario Symposium, Vol. 8, Hillsdale, NJ: Erlbaum.
Schwartz, S.H. (2004), ‘Mapping and interpreting cultural differences around the world’ in: H. Vinken, J. Soeters and P. Ester (eds), Comparing Cultures: Dimensions of Culture in a Comparative Perspective, Leiden, The Netherlands: Brill.
Schwartz, S.H. (2005a), ‘Basic human values: Their content and structure across countries’ in: A. Tamayo and J.B. Porto (eds), Valores e Comportamento nas Organizações [Values and Behaviour in Organizations], Petrópolis, Brazil: Vozes, pp.21–55.
Schwartz, S.H. (2005b), ‘Robustness and fruitfulness of a theory of universals in individual human values’ in: A. Tamayo and J.B. Porto (eds), Valores e Comportamento nas Organizações [Values and Behaviour in Organizations], Petrópolis, Brazil: Vozes, pp.56–95.
Schwartz, S.H. and Bardi, A. (1997), ‘Influences of adaptation to communist rule on value priorities in Eastern Europe’, Political Psychology, 18, pp.385–410.
Schwartz, S.H. and Bardi, A. (2001), ‘Value hierarchies across cultures: Taking a similarities perspective’, Journal of Cross-Cultural Psychology, 32, pp.268–290.
Schwartz, S.H. and Bilsky, W. (1987), ‘Toward a universal psychological structure of human values’, Journal of Personality and Social Psychology, 53, pp.550–562.
Schwartz, S.H. and Boehnke, K. (2004), ‘Evaluating the structure of human values with confirmatory factor analysis’, Journal of Research in Personality, 38, pp.230–255.
Schwartz, S.H., Lehmann, A. and Roccas, S. (1999), ‘Multimethod probes of basic human values’ in: J. Adamopoulos and Y. Kashima (eds), Social Psychology and Cultural Context: Essays in Honor of Harry C. Triandis, Newbury Park, CA: Sage.
Schwartz, S.H., Melech, G., Lehmann, A., Burgess, S. and Harris, M.
(2001), ‘Extending the cross-cultural validity of the theory of basic human values with a different method of measurement’, Journal of Cross-Cultural Psychology, 32, pp.519–542.
Schwartz, S.H. and Rubel, T. (2005), ‘Sex differences in value priorities: Cross-cultural and multi-method studies’, Journal of Personality and Social Psychology, 89, pp.1010–1028.
Scott, W.A. (1965), Values and Organizations: A Study of Fraternities and Sororities, Chicago: Rand McNally.
Smith, P.B. and Schwartz, S.H. (1997), ‘Values’ in: J.W. Berry, M.H. Segall and C. Kagitcibasi (eds), Handbook of Cross-Cultural Psychology, Vol. 3, 2nd edition, Boston: Allyn and Bacon, pp.77–118.
Srull, T.K. and Gaelick, L. (1983), ‘General principles and individual differences in the self as a habitual reference point: An examination of self-other judgments of similarity’, Social Cognition, 2, pp.108–121.
Tetlock, P.E. (1986), ‘A value pluralism model of ideological reasoning’, Journal of Personality and Social Psychology, 50, pp.819–827.
Tversky, A. (1977), ‘Features of similarity’, Psychological Review, 84, pp.327–352.
Tyler, T.R. and Schuler, R.A. (1991), ‘Aging and attitude change’, Journal of Personality and Social Psychology, 61, pp.689–697.
Veroff, J., Reuman, D. and Feld, S. (1984), ‘Motives in American men and women across the adult life span’, Developmental Psychology, 20, pp.1142–1158.
Williams, R.M., Jr. (1968), ‘Values’ in: Sills, E. (ed.), International Encyclopedia of the Social Sciences, New York: Macmillan.
Appendix 1: 21 items of the ESS human values scale and the values they measure

1. Thinking up new ideas and being creative is important to her. She likes to do things in her own original way. (Self-direction)
2. It is important to her to be rich. She wants to have a lot of money and expensive things. (Power)
3. She thinks it is important that every person in the world be treated equally. She believes everyone should have equal opportunities in life. (Universalism)
4. It's very important to her to show her abilities. She wants people to admire what she does. (Achievement)
5. It is important to her to live in secure surroundings. She avoids anything that might endanger her safety. (Security)
6. She likes surprises and is always looking for new things to do. She thinks it is important to do lots of different things in life. (Stimulation)
7. She believes that people should do what they're told. She thinks people should follow rules at all times, even when no one is watching. (Conformity)
8. It is important to her to listen to people who are different from her. Even when she disagrees with them, she still wants to understand them. (Universalism)
9. It is important to her to be humble and modest. She tries not to draw attention to herself. (Tradition)
10. Having a good time is important to her. She likes to "spoil" herself. (Hedonism)
11. It is important to her to make her own decisions about what she does. She likes to be free and not depend on others. (Self-direction)
12. It's very important to her to help the people around her. She wants to care for their well-being. (Benevolence)
13. Being very successful is important to her. She hopes people will recognise her achievements. (Achievement)
14. It is important to her that the government insure her safety against all threats. She wants the state to be strong so it can defend its citizens. (Security)
15. She looks for adventures and likes to take risks. She wants to have an exciting life. (Stimulation)
16. It is important to her always to behave properly. She wants to avoid doing anything people would say is wrong. (Conformity)
17. It is important to her to be in charge and tell others what to do. She wants people to do what she says. (Power)
18. It is important to her to be loyal to her friends. She wants to devote herself to people close to her. (Benevolence)
19. She strongly believes that people should care for nature. Looking after the environment is important to her. (Universalism)
20. Tradition is important to her. She tries to follow the customs handed down by her religion or her family. (Tradition)
21. She seeks every chance she can to have fun. It is important to her to do things that give her pleasure. (Hedonism)
Appendix 2: Value profiles: Mean importance scores of values in 20 countries

      Ben   Univ  Sdir   Sec  Trad  Conf   Hed   Ach   Sti   Pow  Cons    Op    Se    St
AT    .75   .62   .54   .28  −.30  −.48  −.18  −.18  −.67  −.70  −.16  −.10  −.44   .67
BE    .76   .59   .38   .20  −.01  −.20   .18  −.54  −.64 −1.04  −.01  −.02  −.79   .66
CH    .77   .76   .70   .16  −.13  −.67   .22  −.55  −.64 −1.01  −.21   .09  −.78   .76
CZ    .60   .67   .52   .72   .14   .36  −.63  −.85 −1.03  −.83   .41  −.38  −.84   .64
DE    .78   .66   .62   .44  −.09  −.39  −.14  −.50  −.82  −.91  −.01  −.11  −.70   .71
DK   1.01   .52   .69  −.10  −.36  −.17   .22  −.57  −.54  −.96  −.21   .12  −.76   .71
ES    .68   .61   .37   .57   .18  −.07  −.38  −.61  −.84  −.81   .23  −.28  −.71   .64
FI    .68   .77   .45   .53  −.20   .07  −.27  −.73  −.46 −1.22   .13  −.09  −.97   .74
FR    .69   .81   .50   .36  −.02  −.29   .13  −.85  −.55 −1.19   .01   .03 −1.02   .76
GB    .67   .53   .51   .44  −.12  −.14  −.36  −.36  −.45  −.98   .06  −.10  −.67   .58
GR    .47   .45   .15   .66   .35   .04  −.38  −.44  −.87  −.65   .35  −.37  −.55   .46
HU    .44   .47   .33   .75   .08  −.28   .03  −.27  −.88  −.91   .18  −.17  −.59   .46
IE    .62   .59   .44   .60   .14  −.12  −.57  −.41  −.66  −.92   .20  −.26  −.67   .60
IL    .52   .33   .39   .48  −.21  −.36  −.11   .22  −.75  −.70  −.03  −.15  −.24   .40
NL    .65   .56   .53   .21  −.21  −.01   .04  −.51  −.55  −.99   .00   .01  −.75   .59
NO    .76   .59   .46   .13  −.10   .25  −.37  −.56  −.58  −.89   .09  −.16  −.72   .66
PL    .51   .54   .19   .76   .28   .39  −.97  −.42  −.76  −.78   .48  −.52  −.60   .53
PT    .70   .48   .25   .53   .25  −.32  −.24  −.24  −.79  −.88   .16  −.26  −.56   .56
SE    .77   .67   .53   .03  −.04  −.25   .04  −.58  −.59  −.91  −.08  −.01  −.75   .71
SI    .31   .50   .34   .44   .17  −.23  −.21  −.17  −.54  −.87   .13  −.13  −.52   .43

Notes: Column labels: Ben = Benevolence, Univ = Universalism, Sdir = Self-direction, Sec = Security, Trad = Tradition, Conf = Conformity, Hed = Hedonism, Ach = Achievement, Sti = Stimulation, Pow = Power, Cons = Conservation, Op = Openness, Se = Self-enhancement, St = Self-transcendence. Row labels: AT = Austria, BE = Belgium, CH = Switzerland, CZ = Czech Republic, DE = Germany, DK = Denmark, ES = Spain, FI = Finland, FR = France, GB = United Kingdom, GR = Greece, HU = Hungary, IE = Ireland, IL = Israel, NL = Netherlands, NO = Norway, PL = Poland, PT = Portugal, SE = Sweden and SI = Slovenia
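The higher-order columns in Appendix 2 are internally consistent with simple item-weighted means of their component values. This grouping is our inference from the table values, not something stated in the notes: Hedonism is counted under Openness, and Universalism carries three items to Benevolence's two. Austria's row can be recomputed as a check:

```python
# Austria's ten basic-value scores from Appendix 2 (mean-centred).
at = {"Ben": .75, "Univ": .62, "Sdir": .54, "Sec": .28, "Trad": -.30,
      "Conf": -.48, "Hed": -.18, "Ach": -.18, "Sti": -.67, "Pow": -.70}

# Higher-order dimensions as item-weighted means of their components
# (inferred from the table, not stated in the notes; each value is
# measured by two items except Universalism, which has three).
cons = (at["Sec"] + at["Trad"] + at["Conf"]) / 3      # table: -.16
op = (at["Sdir"] + at["Sti"] + at["Hed"]) / 3         # table: -.10
se = (at["Ach"] + at["Pow"]) / 2                      # table: -.44
st = (2 * at["Ben"] + 3 * at["Univ"]) / 5             # table:  .67
print(round(cons, 2), round(op, 2), round(se, 2), round(st, 2))
```

Small residual differences (cons computes to −0.167 against the printed −.16) reflect the fact that the published figures were rounded from unrounded item-level data.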
Appendix 3: Standardised regression coefficients for predicting attitudes and behaviour within and between countries from hierarchical linear modelling analysis

[Coefficient values not recoverable from this extraction. Columns (dependent variables): opposition to 'other' immigrants, interpersonal trust, social involvement, membership in voluntary organisations, political activism. Individual-level predictors: education years, age, income, native born, religiosity, not unemployed, gender male, political efficacy, universalism, benevolence, conformity, security, hedonism, stimulation and self-direction values, interactions of political efficacy with universalism, stimulation and self-direction values, and the individual-level R2. Country-level predictors: egalitarianism, embeddedness and intellectual autonomy values, the human development index, percentage in big cities, life expectancy, and the country-level R2.]

Note: All coefficients are significant at p < .001 for the individual level and at p < .01 for the country level. The analyses included data from 20 countries for interpersonal trust, social involvement and political activism, from 18 countries for organisational membership (this variable was not measured in Switzerland and the Czech Republic), and from the 15 West European countries for opposition to 'other' immigrants.
a The betas shown for both these variables (egalitarianism and the human development index) are based on an HLM model including one and not the other. When included alone, both are significant (p < .001) and explain the same amount of variance. However, neither is significant if they are included together (egalitarianism p = .015, HDI p = .222)
Jowell-Chapter-10.qxd  3/9/2007  8:18 PM  Page 205

10
Patterns of political and social participation in Europe

Kenneth Newton and José Ramón Montero*
Introduction

Participation is vital to social and political life. Democracy is sustained by citizen participation in elections and political campaigns, by joining parties and pressure groups, by voting and signing petitions, and by reading the papers and discussing politics with friends and colleagues. We help to create healthy communities and build a happy life for ourselves by supporting local associations, giving money to good causes, meeting friends and neighbours, engaging in local and civic affairs, looking after the neighbours' cat while they are on holiday, and by joining car pools, baby-sitting circles and book clubs. Of course, we take such things for granted because they are part of the daily round, but without participation life would be arid, isolated and alienated. Indeed, without participation society would scarcely exist, and there would be no democracy.

What determines the amount and type of participation in society? How do societies differ in the modes of participation they favour? Does the welfare state drive out private and voluntary helping behaviour? Have the small local
* Kenneth Newton is Professor of Comparative Politics at the University of Southampton and Visiting Fellow at the Wissenschaftszentrum Berlin. José Ramón Montero is Professor of Political Science at the Universidad Autónoma de Madrid and the Instituto Juan March, Madrid.
1 Ken Newton would like to thank Ebrahim Khodaie-Biramy for his enormous assistance with computing, and the Nuffield Foundation for the research grant that financed the work for this chapter.
communities of rural society, where people know each other and help each other, given way to modern urban-industrial cities that are populated by strangers? Are we now all so busy that we have less and less time and energy for neighbourly activities, voluntary associations, mutual help and political causes? If so, then the prospects for healthy democracy and a balanced and integrated society are poor indeed.

In this chapter, we examine patterns of participation in the 22 countries covered by the European Social Survey (ESS) in its first round. Besides the more obvious aspects of democratic participation, such as voting, party membership and campaign activity, we also cover more personal and low-key political activities, such as being interested in politics and discussing it. We will not limit our study to politics, however. After all, political participation does not stand alone; it is part and parcel of wider social and economic life. Therefore we enlarge the focus to cover social as well as political participation – informal social involvement, helping others and membership of voluntary associations.

We will also change the focus of previous studies by examining how levels of participation vary among countries in Europe. A great deal of research has been done on individual participation – research that takes the individual as its unit of analysis – but here we compare whole countries to see how their overall patterns of participation compare. For example, we might be interested in what sorts of individual vote in elections, but we might also be interested in the election turnout of countries as a whole. There have, of course, been many studies of political and social participation before ours, and it is useful to draw some general lessons from them before starting on our own analysis.

Individual participation: fragmented and multidimensional

It used to be thought that the more politically involved the citizen, the more likely they were to accumulate different political activities. At one extreme, a large minority of the population was thought to be largely inactive politically, apart from voting every now and again. At the other extreme, the most active – the political class – would be involved in many forms of politics, from voting, active party membership and election campaigning, to giving time and money to political organisations, joining civic associations and holding public office. In between, the more politically active the citizen, the more forms of political action they would engage in. In this sense political activity was thought to be cumulative. The most active would be rather like the talented and versatile members of a small musical group who might play half a dozen or more instruments.

In their classic study of political participation in America, Verba and Nie (1972) found this perfectly common-sense expectation to be wrong. Most citizens do not accumulate different forms of political activity as they become more involved. They generally specialise in one form of political activity and stick to
it. They vote, or they are party activists, or they get involved in community affairs, or they contact politicians on matters of interest to them, but they do not often do two or more of these things. Only a small percentage of political activists combine a variety of different kinds of involvement. In this sense, citizens do not participate in politics like versatile musicians playing many instruments: they participate as members of an orchestra who specialise in their own instrument. The overall political effect is produced by their combined but separate efforts.

Verba and Nie found that a few individuals (4 per cent of the US adult population) do little but contact political leaders about particular matters that interest them. Another section of the population (20 per cent) specialises in community activity by combining with other individuals or local organisations. A third group (15 per cent) are campaign activists who join political clubs and parties and volunteer to work for them by canvassing, attending political meetings and contributing to funds. A fourth group (21 per cent) does little but vote, a fifth (22 per cent) are inactive, taking almost no part in political life, while a sixth group (11 per cent) is that minority of complete activists who cover a range of different types of political participation. Apart from the sixth group, there was little overlap between the different forms of activity.

The specialised nature of political participation is confirmed in a follow-up study carried out in Austria, India, Japan, the Netherlands, Nigeria, the United States and Yugoslavia (Verba, Nie and Kim, 1978). The same pattern was also found in other studies of political participation in Costa Rica, Canada, Norway, Tokyo, Britain and the United States (pp.331–39).
All this research found that individuals generally specialise in their chosen type of participation and, moreover, that the forms of specialisation are often much the same from one country to another. For example, research in Britain found six independent modes of political participation along much the same lines as the other studies – voting, party campaigning, collective action, contacting, direct action and political violence (Parry et al., 1992, p.58).

Does this pattern of specialised political participation apply to Europe, as revealed by the ESS? We cannot replicate the original work done by Verba and Nie exactly, because the ESS does not ask identical questions, but it does ask a set of 11 questions about different kinds of political participation that are similar to those asked in previous work. We can use the same statistical method as the original work by Verba and Nie. Known as principal component analysis (PCA), this technique has three great advantages for the present research. First, it can be a quick and efficient way of reducing a mass of incomprehensible statistics to a few understandable ones. Second, it is a method of seeing whether there is a pattern or latent structure in a great mass of data; it helps us to see the shape of a forest made up of a large number and variety of trees. And third, it helps us to identify types of attitude
and behaviour. By showing how different aspects of participation may cluster together, principal component analysis helps us to understand the nature of different kinds of participation. For example, if we find that voting, party membership and campaign activity form one cluster, and involvement in demonstrations, political strikes and sit-ins form another, we get the sense that conventional politics and protest politics are separate, and can then set about trying to find out who is engaged in what sorts of activity and why.

If principal component analysis reveals a single principal component, or extracts one that accounts for a large proportion of the variance, we can conclude that political participation tends to be all of a piece (unidimensional and cumulative), because high participation on any one measure is likely to be accompanied by high participation on the other measures. If, however, principal component analysis finds not one large or single component, but several of them, each covering a small cluster of measures of participation, then we conclude that there is no single underlying dimension to participation, but a number of different and distinct types. Most studies of individuals show that citizens usually specialise in a particular form of political participation, avoiding the others.

The results of a principal component analysis of the 11 ESS measures of political participation in the UK (see Appendix 1 at the end of this chapter) reveal four components, each of them quite small, and each separate from and independent of the others.2 The first, explaining (in a statistical sense) only 17 per cent of the variance in levels of participation in the 11 kinds of political activity, is associated with working for a political organisation, displaying campaign material, taking part in demonstrations and donating money. The second component explains just less than 16 per cent of the variance in levels of political participation and includes signing petitions, boycotting products, and buying ethical products. The third component explains 15 per cent of the variance and covers interest in and discussion of politics. And the smallest component of all (explaining nine per cent of the variance) is associated only with voting.

These results are entirely consistent with previous research showing that individual political participation among the citizens of any given country is normally fragmented into a number of different, specialised and independent activities. The few cross-national studies of participation that have been carried out all show the same picture of fragmentation, and in one seven-nation study Verba, Nie and Kim (1978) find that each nation has its own special and unique profile of political participation. Thus, voting in Austria is high, but communal activity is quite low. In the United States, voting and particularised contacting is low, but

2 In fact the four principal components in the table in Appendix 1 explain only a little more than half of the total variance in political participation, which means that there is likely to be a variety of other components, not shown for lack of space. The patterning of individual political participation is indeed weak.
communal activity is high. In the Netherlands, particularised contacting is high, voting quite high, but campaign activity is low. The countries do not show the same level of activity across a number of kinds of participation, but rather vary in their own special ways. Each has its own profile.

National levels of participation: also fragmented and multi-dimensional?

For the purposes of cross-national comparative work, the ESS goes to great lengths to ensure that the surveys in each country are as comparable as possible; there was a sufficient number of nations (22) in Round 1 to draw conclusions about cross-national differences; and the countries themselves are varied, covering much of Europe (said to be God's natural laboratory of nations) from the rich and democratically stable ones of the north, to the poorer and newer democracies of the Mediterranean and Central Europe.

National and individual patterns of political participation are not necessarily the same. For example, one of the paradoxes of voting studies is that in most countries the better educated the population, the higher the voting turnout. Yet two of the most highly educated countries in the world, the United States and Switzerland, have rather low voting turnouts. This is because voting turnout is the product of a combination of individual factors (education, gender, age, income, political interest) and national factors, of which political institutions, the voting registration system and the party system are among the more important. In the USA the better educated are more likely to vote, and the USA also has a high level of educational attainment, but nonetheless turnout is low because of the party, electoral and registration systems. It should not be assumed that patterns at the individual level will necessarily or invariably be repeated at the national level, although they may be. It is quite likely that, like individuals, countries have their own special patterns of political participation.

Since each country has its own, unique combination of historical, cultural, institutional and political characteristics, and since each has its particular mix of social and economic circumstances, it would not be at all surprising if each had its own particular and perhaps unique profile of participation. This is what the Verba, Nie and Kim (1978) study of seven nations suggests. It is, unfortunately, a characteristic of social surveys that they often confirm what we all know perfectly well already from 'common sense' and everyday experience. Surveys are useful because they provide us with hard facts, but we cannot expect them to overturn what we already know only too well about the world, because we live in it every day of our lives. In other words, we should not be surprised or disappointed if the ESS confirms the common-sense expectation that each country, like each individual, has its own specialised pattern of participation.
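The logic of the principal component analysis used throughout this chapter can be illustrated with a small simulation. The item names and data below are invented, not the ESS measures, and NumPy's eigendecomposition stands in for whatever software the authors used:

```python
import numpy as np

# Synthetic 0/1 participation data: two independent latent tendencies
# each drive a separate pair of items, mimicking the "specialised
# participation" pattern described in the text.
rng = np.random.default_rng(0)
n = 500
protest = rng.random(n) < 0.3          # latent protest tendency
voting = rng.random(n) < 0.7           # latent electoral tendency
items = np.column_stack([
    protest & (rng.random(n) < 0.8),   # attended a demonstration
    protest & (rng.random(n) < 0.7),   # boycotted a product
    voting & (rng.random(n) < 0.9),    # voted
    voting & (rng.random(n) < 0.6),    # contacted a politician
]).astype(float)

# PCA via the eigendecomposition of the item correlation matrix.
corr = np.corrcoef(items, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]          # largest component first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()        # share of variance per component
loadings = eigvecs * np.sqrt(eigvals)      # item-component loadings
print(np.round(explained, 2))
```

Because the data are built from two separate item clusters, two sizeable components emerge rather than one dominant dimension: high scores on one cluster say little about the other, which is what "fragmented, not cumulative" participation looks like in a component structure.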
Types of participation

The ESS survey asks a series of 25 questions about many kinds of participation. The questions, listed in Appendix 2 at the end of this chapter, can broadly be grouped into five categories, as follows:

1. Participation in voluntary associations
2. Meeting socially
3. Helping behaviour
4. Conventional political participation
5. Political protest behaviour.
In the next part of the chapter we will look in turn at each of these kinds of participation in each nation, to see if countries are like individuals, with each showing its own specialised pattern of social and political involvement.

Participation in voluntary associations

Voluntary associations and clubs are said to be crucially important for both social and political life. Through them, individuals gain a sense of social belonging and identity in their community. By creating overlapping and interlocking networks of people and organisations, they help to bind society together and create trust, co-operation and a common purpose among citizens. They are important politically because they establish organised social life outside the power of the state while, at the same time, providing ready-made social organisations that may also, when the need arises, be mobilised politically. Many associations carefully avoid politics, but nonetheless have a political significance because they help to integrate and stabilise society, and help to build the social foundations of effective and efficient government. As noted, they may also become politically active if required. Football fan clubs are not political – unless or until governments affect their interests by cracking down on football hooliganism, influencing the transfer market (which happened at an EU level), subsidising or taxing football in some way, or helping/hindering national football associations to attract major tournaments to their country. Voluntary activists are also a pool of people who can be recruited into political and community life.

In general it might be thought that each country would have its own favoured set of voluntary associations. Religious countries would have strong church and religious organisations, sport-loving countries would have a high sports club membership, socialist nations a large trade union membership. Some countries might be more interested in arts and science clubs, while others would lean towards social clubs. Countries with a young population might have many youth clubs, and so on.
The ESS asks about 12 kinds of voluntary organisation – namely business, consumer, cultural, environmental, humanitarian, political, religious, science, social, sport, trade union, and others. Since voluntary participation takes different forms, the survey also asks respondents to specify whether they are members of, participators in, donors of money to, or voluntary workers for each kind of association. The ESS therefore provides us with measures of four different sorts of participation in 12 different types of voluntary organisation in 20 countries (Switzerland and the Czech Republic are not included in this part of the study, though they are covered in all later parts).3 Here we simply focus on any type of participation in any of the kinds of voluntary association.

Table 10.1 shows the percentage of the population in each country that is involved in any of the four ways in any of the 12 kinds of voluntary association. Since this is a large and rather overwhelming collection of statistics, the table attempts to make them more manageable by organising the statistics in a particular way. First, as explained, it shows a single composite figure for all four kinds of participation (membership, participation, donating money, voluntary work). Second, it has been organised so that associations with the largest number of participants appear in the first column (sports clubs), and associations with the fewest participants appear in the last column. And third, the countries are ranked from top to bottom in terms of their overall rates of participation across all 25 of our social and political measures. The basis for this overall ranking is explained in more detail later; more important at this stage is the overall rank ordering of countries.

The figures present a major surprise. Far from showing that each country has a uniquely variable amount of participation in each kind of association, the table seems to present a clear pattern in which a country's level of participation in any one kind of association is generally repeated in most of the others. At one extreme, Austria and Norway generally have comparatively high figures for all 12 kinds of voluntary association, and they are usually well above the average for all 20 countries. At the other extreme, Poland and Greece almost always have low levels of participation for all 12 voluntary associations and are well below the 'ESS average'. Simply by 'eyeballing' the figures in the tables, it is clear that there are large differences in levels of participation between Austria and Norway on the one hand, and Poland and Greece on the other.
3 In Round 1 of the ESS, both Switzerland and the Czech Republic failed to administer this part of the questionnaire as required.
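The composite "any involvement" measure used in Table 10.1 can be sketched as follows. The record layout and field names here are invented for illustration and are not the actual ESS variable codes:

```python
from collections import defaultdict

# Toy respondent records: for each association type, the set of the four
# forms of involvement (member, participated, donated, volunteered)
# that the respondent reported.
respondents = [
    {"country": "AT", "sport": {"member", "donated"}, "cultural": set()},
    {"country": "AT", "sport": set(), "cultural": {"volunteered"}},
    {"country": "PL", "sport": set(), "cultural": set()},
]

def involved_any(resp):
    """Any of the four forms of involvement in any association type."""
    return any(forms for key, forms in resp.items() if key != "country")

counts = defaultdict(lambda: [0, 0])        # country -> [involved, total]
for r in respondents:
    counts[r["country"]][0] += involved_any(r)
    counts[r["country"]][1] += 1

rates = {c: 100 * inv / tot for c, (inv, tot) in counts.items()}
print(rates)                                # {'AT': 100.0, 'PL': 0.0}
```

Collapsing 4 forms x 12 types into a single yes/no per respondent is what makes the table's one-number-per-cell comparison across 20 countries possible.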
Table 10.1  Participation in 12 kinds of voluntary association, by country (percentages)

[Cell percentages not recoverable from this extraction. Rows (countries, ranked by overall participation): Austria, Norway, Switzerland, Sweden, Denmark, Netherlands, Finland, Belgium, Luxembourg, Ireland, Germany, Israel, UK, France, Czech Republic, Spain, Slovenia, Italy, Portugal, Hungary, Greece, Poland. Columns (types of voluntary organisation, ordered from most to least participated-in): sport, cultural, consumer, religious, trade union, humanitarian, social club, environmental, business, science, other, political; plus the unweighted base. A final row gives the average for all 20 countries.]

Notes: Comparable questions were not asked in Switzerland and the Czech Republic. In this table and all following tables the base numbers include 'Don't know', 'Not answered' and 'Refusal'. In this and all following tables, country statistics are weighted by the country-specific design weights, which correct for differential selection probabilities owing to the effects of clustering in different national sample designs. In the last row of this table and all following tables (where applicable), the data are weighted not only by the country-specific design weights but also by population weights that correct for different population sizes across countries
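The two weighting stages described in the table notes can be illustrated with invented numbers (the real ESS design and population weights differ):

```python
# Invented participation rates and population weights for three countries.
countries = {
    # country: (participation rate %, population weight)
    "DE": (24, 0.82),
    "NO": (44, 0.05),
    "PL": (6, 0.38),
}

# Within a country, the rate itself is a design-weighted mean of
# respondent-level 0/1 indicators (not shown here).  Across countries,
# the 'ESS average' additionally weights each country's rate by its
# population weight, so large countries count for more:
total_w = sum(w for _, w in countries.values())
ess_average = sum(rate * w for rate, w in countries.values()) / total_w
print(round(ess_average, 1))                # 19.3
```

An unweighted mean of the three rates would be 24.7; the population weighting pulls the average toward the larger countries' rates, which is why the 'ESS average' row can differ noticeably from a simple average of the country figures.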
A further look at Table 10.1 shows that the other countries also maintain a generally consistent ranking across the 12 types of association. Spain, Slovenia, Italy, Portugal and Hungary have generally low levels of participation across each type of organisation, and in all but a few cases below the 20-nation average. Sweden, Denmark, the Netherlands and Finland generally have higher levels of participation, and in most cases are well above the average. Luxembourg, Ireland, Germany, Israel, the United Kingdom and France are close to the average. In other words, contrary to our 'unique country profiles' hypothesis, the table shows a pattern in which most countries hold their rank order position for participation in most types of voluntary association. The relative level of participation in any one type of organisation is generally repeated in all the others.

There are, to be sure, a few notable deviations from this general rule. Again, simply eyeballing the data reveals a number of these deviations. Ireland, for example, is high on religious and business/professional/farmers' participation, but comparatively low on consumer groups. The Netherlands has high levels of participation in sports, but is low on social clubs. Sweden and Norway are relatively low in respect of religious participation, and Norway in respect of environmental/peace/animal groups, while Belgium has the highest level of participation in social clubs but a low level of participation in consumer groups.

Do these exceptions disprove the general rule? Rather than scanning the large number of percentage figures in Table 10.1 to try to get an answer, we can again use the statistical method of principal component analysis to explore underlying relationships in patterns of participation across different countries.
The results (Appendix 3 at the end of this chapter) show that variation in patterns of participation cross-nationally can be explained by three principal components – one large component (explaining no less than two-thirds of the variance), and two smaller ones (explaining just nine per cent and six per cent of the variance respectively). This first component loads heavily and positively on all 12 kinds of association, showing that there is a strong, single pattern to all the country figures.4 The result could scarcely be clearer in showing that participation in voluntary organisations in different countries is not fragmented, distinctive, or multi-dimensional. Countries do not have their own unique and variegated participatory profiles so far as voluntary associations are concerned. On the contrary, high involvement in one kind of association in any given country means 4
The figures in Appendix 3 are the factor loading between each principal component and each individual measure on which it is based. Factor loadings vary between a maximum of plus or minus 0.999. The factor loading of 0.929 in the first column shows the first principal component is most closely associated with cultural organisations, while the figure of 0.689 shows that it is most weakly associated with trade unions, although by most standards this figure is also rather strong. The figures in the last row (% of the variance explained) also show that the first component is a strong one, explaining very nearly twothirds of all the statistical variance of participation in the 12 kinds of voluntary association.
that there is generally high involvement in all the other kinds of association in that country. The same pattern holds for low participation. Although we have a diverse range of voluntary associations and an equally diverse array of countries, participation follows a regular pattern, allowing us to rank countries on a single voluntary association scale.

Social and helping behaviour

Participation in formally organised voluntary associations is one thing; meeting people socially is quite another. Most social life consists of informal meetings and gatherings, either chance or arranged – in cafés and bars, at bridge and book clubs, at sporting events and concerts, around dinner tables, for a drink and a chat after work or for a weekend with friends. The ESS asks how often people meet socially with friends, relatives and work colleagues.

Socialising of this sort is said to suffer as a result of the pressures of modern life. Big cities are commonly supposed to be lonely places inhabited by strangers, while small-scale rural society is supposed to encourage roots in the local community. On the other hand, there may be other kinds of national influence on social patterns of this kind: social life in relatively cold and wintry Northern Europe might take the form of arranged and organised meetings in warm community centres and village halls, while the same sorts of people in Mediterranean countries, with their longer, balmy evenings, might stroll to a bar, café or town square to meet with friends. In this sense social clubs and meetings take many national forms, from the 'passeggiata' of southern countries, to the Stammtisch in Germany and Austria, to the local pub in the UK.
In general, one might expect that the more social life is formally organised by voluntary associations (as in cold Northern Europe), the less informal socialising there will be; and the more social meetings are informal (as in warmer Southern Europe), the smaller the role of formal voluntary associations in social life.

The same sort of inverse relationship might well apply to informal helping behaviour. Society could barely exist if people did not help each other in many different ways – doing a bit of shopping, visiting the ill, looking after the cat or the house plants, lending a bit of money, doing household repairs, nursing the sick. Once again, one might expect national patterns to vary. Large-scale, urban-industrial society is often said to create an impersonal world in which people tend not to help each other personally, relying instead on formally organised charities, welfare groups and church aid, while small-scale, rural societies rely more on the personal support of friends, family and neighbours.
So it may be that the more advanced the welfare state, the less important are formally organised voluntary associations such as charities, mutual aid and humanitarian associations. Similarly, it may be that the more important formally organised state and private welfare organisations are, the less prevalent informal helping activity will be. Both left- and right-wing social theorists argue that the welfare state tends to drive out the voluntary spirit in modern society. Jürgen Habermas, the German Marxist philosopher, went even further, claiming that the welfare state has colonised civil society and undermined natural forms of solidarity (Deflem, 1996). Similarly, Alan Wolfe (1989), an American political scientist, claims that the welfare state undermines the moral strength of both intimate and distant social ties. If this is true, one would expect informal helping behaviour to be inversely related to participation in formally organised humanitarian associations (charities, welfare and aid associations). It also follows that advanced welfare states will tend to be weak civil societies, with lower levels of social solidarity and social participation, especially in the form of formal and informal helping behaviour of a 'neighbourly' kind. Once again, all this suggests that countries with high rates of participation in one kind of activity might have low rates in others.

The ESS asks about both social life and helping behaviour. The measure of informal social engagement used here is the percentage of respondents saying that they meet socially, and the frequency with which they do so. The measure of helping behaviour is the percentage saying that they have actively provided help for others, and the frequency with which they do so. The first two columns of Table 10.2 compare participation in organised social clubs with the frequency of meeting socially with friends, relatives and work colleagues.
The figures are ranked according to countries' overall rates of participation on all 25 measures, from the highest to the lowest. Once again, our expectations are confounded by the findings. Although, as one would expect, meeting socially is much more common than participating in social clubs, the figures in the two columns tend to move down together as we go from Austria and Norway at the top to Greece and Poland at the bottom. Formal and informal socialising, it seems, are not alternative modes of social participation, but vary in tandem. In between these extremes, the other country figures for the two forms of social involvement also tend to decline together – albeit unevenly and irregularly.
Table 10.2 Percentages by country of membership of formal social clubs vs. informal social contacts; and of formal vs. informal helping behaviour

Country | Formal social club | Informal social contacts | Formal humanitarian association | Informal helping others | Unweighted base
Austria | 25 | 92 | 24 | 81 | 2257
Norway | 27 | 98 | 37 | 66 | 2036
Switzerland | 26 | 97 | 20 | 85 | 2040
Sweden | 22 | 97 | 33 | 67 | 1999
Denmark | 20 | 98 | 25 | 72 | 1506
Netherlands | 12 | 96 | 25 | 76 | 2364
Finland | 14 | 95 | 21 | 54 | 2000
Belgium | 29 | 93 | 21 | 71 | 1899
Luxembourg | 23 | 91 | 17 | 58 | 1552
Ireland | 24 | 92 | 16 | 58 | 2046
Germany | 18 | 94 | 21 | 78 | 2919
Israel | 19 | 94 | 11 | 68 | 2499
UK | 21 | 92 | 14 | 61 | 2052
France | 14 | 95 | 15 | 55 | 1503
Czech Republic | 15 | 86 | 8 | 40 | 1360
Spain | 11 | 92 | 11 | 44 | 1729
Slovenia | 18 | 87 | 15 | 74 | 1519
Italy | 8 | 87 | 15 | 44 | 1207
Portugal | 7 | 90 | 7 | 67 | 1511
Hungary | 8 | 69 | 3 | 61 | 1685
Greece | 4 | 77 | 2 | 55 | 2566
Poland | 3 | 84 | 3 | 52 | 2110
Average of all countries | 14 | 91 | 15 | 59 | 42359

Notes: As comparable questions about membership of voluntary organisations were not asked in Switzerland and the Czech Republic, the figures in the second and fourth columns for these countries have been imputed by regression analysis (see Little and Rubin, 2003)
As the last two columns in Table 10.2 show, activity in formally organised welfare, aid and humanitarian associations is far less common than informal helping behaviour but, as before, the two sets of figures tend to decline together as we move from Austria and Norway at the top to Greece and Poland at the bottom. Charities and aid organisations, it seems, do not drive out giving personal help to others. On the contrary, the two forms of activity seem to reinforce each other: the more formal and organised help there is, the more informal and personal help there is too. Nor does the welfare state drive out either informal social ties or informal helping activity. On the contrary, the advanced welfare states of Northern Europe (Norway, Sweden, Denmark, the Netherlands and Finland) generally have higher rates of social engagement and helping activity than the less well-developed welfare states of Poland, Greece, Hungary and Portugal.
The figures in Appendix 4 at the end of this chapter confirm the strong positive association between the four kinds of social and helping behaviour. Principal component analysis produces a large first component that loads heavily and positively on all four measures and explains 69 per cent of the variance. Formal and informal social and helping activity do not follow their own separate paths, drive each other out, or serve as functional equivalents. Instead, they go hand in hand. Nor does the welfare state seem to undermine social ties or helping behaviour, whether informal or organised through voluntary associations. On the contrary, the welfare state seems to encourage them both.

Conventional political participation

Conventional political activity is the most widespread form of involvement in democratic politics. Our measures of conventional political participation are the percentages of respondents saying they had engaged during the past 12 months in each of the following activities: voting; involvement in political campaigns; contacting political and government officials; joining, working for, or giving money to political organisations; being interested in politics; and talking about politics.

Table 10.3 does not conform to the well-established finding that political participation at the individual level is fragmented into different and separate kinds of activity, or that each country has its own particular and unique profile. On the contrary, the figures generally follow the pattern of the two previous percentage tables. Countries that are high in participation in voluntary associations and in social and helping behaviour also tend to be high in the seven forms of conventional political engagement. There are, however, some conspicuous exceptions to this general rule – such as Austria in respect of displaying campaign material, Germany and Israel in respect of political interest, and Italy in respect of voting – but overall the same consistencies can be observed.
This is further illustrated in Appendix 5 at the end of this chapter, which shows the results of a principal component analysis of different forms of political activity at the country level. But whereas Appendix 1 shows that individual political participation is usually split between four small and separate components, a comparison of country-level participation reveals a quite large and strong first component, and two smaller ones. The first component, accounting for 42 per cent of the total variance, covers six of the seven measures of conventional political participation – donating money to a political organisation, doing political work, displaying campaign material, contacting politicians, interest in politics and talking about politics. These are not distinct and separate activities, but different aspects of the same thing.
Table 10.3 Participation in seven kinds of conventional political activities, by country (percentages)

Country | Donated money | Contacted politician | Voted in last election | Displayed campaign material | Worked for political party or group | Interested in politics | Discussed politics | Unweighted base
Austria | 8 | 18 | 88 | 11 | 5 | 58 | 77 | 2257
Norway | 22 | 23 | 84 | 12 | 9 | 49 | 78 | 2036
Switzerland | 9 | 17 | 69 | 18 | 8 | 61 | 85 | 2040
Sweden | 11 | 16 | 87 | 6 | 5 | 57 | 70 | 1999
Denmark | 5 | 18 | 94 | 9 | 4 | 63 | 77 | 1506
Netherlands | 4 | 14 | 86 | 8 | 3 | 66 | 71 | 2364
Finland | 16 | 24 | 82 | 7 | 3 | 46 | 77 | 2000
Belgium | 7 | 18 | 85 | 9 | 5 | 45 | 60 | 1899
Luxembourg | 5 | 18 | 65 | 15 | 4 | 43 | 69 | 1552
Ireland | 9 | 22 | 76 | 10 | 5 | 47 | 60 | 2046
Germany | 6 | 13 | 85 | 9 | 4 | 63 | 82 | 2919
Israel | 12 | 13 | 79 | 12 | 6 | 63 | 75 | 2499
UK | 20 | 18 | 72 | 8 | 3 | 52 | 62 | 2052
France | 11 | 18 | 75 | 3 | 5 | 40 | 74 | 1503
Czech Republic | 5 | 23 | 66 | 12 | 5 | 32 | 66 | 1360
Spain | 10 | 12 | 78 | 5 | 6 | 21 | 53 | 1729
Slovenia | 2 | 12 | 80 | 6 | 4 | 42 | 60 | 1519
Italy | 7 | 12 | 89 | 3 | 3 | 33 | 52 | 1207
Portugal | 7 | 12 | 73 | 4 | 4 | 36 | 65 | 1511
Hungary | 3 | 15 | 81 | 2 | 3 | 46 | 69 | 1685
Greece | 3 | 15 | 91 | 2 | 5 | 32 | 50 | 2566
Poland | 3 | 10 | 66 | 9 | 3 | 40 | 68 | 2110
Average for all countries | 7 | 14 | 79 | 6 | 4 | 45 | 66 | 42359
There are, however, two smaller components of conventional political participation. One of these accounts for 19 per cent of the variance and loads very heavily on voting alone. It is not surprising that voting sticks out on its own in this way, because many of those who vote are not politically engaged in any other way. Voting is thus not a good predictor of other forms of conventional political activity, because 60–80 per cent of the population do it. The same sort of explanation applies to the other small component, which loads heavily on being interested in and discussing politics. Both are common and relatively costless forms of activity, and neither predicts more active and committed forms of activity such as giving money to or working for a political organisation.

Protest politics

We refer to protest behaviour in this context as taking part in any one of four forms of political activity: lawful demonstrations; signing a petition; boycotting products; and deliberately buying certain products for political, ethical or environmental reasons. While all of these are legal and democratic, they differ from conventional political activity in that they are direct forms of political engagement, often outside the usual institutional channels involving government, political parties and pressure groups. In the past 50 years or so protest behaviour has become part of the normal repertoire of politics, though relatively few people in western states engage in it (Barnes et al., 1979). Our measures of political protest activity are the percentages of people who say they have been involved in any of these activities in the past 12 months. Table 10.4 shows how different countries engage in these activities.
Once again we find much the same countries as before with the highest and the lowest rates of participation: Austria, Norway, Switzerland, Sweden and Denmark are well above average on three of the four measures, while Poland, Greece, Hungary and Portugal are well below, and generally have the lowest figures. The exception to the rule is lawful demonstrations, which does not fit the general pattern and stands out on its own as a different kind of activity, perhaps because it is the most active and radical form of unconventional behaviour and the one that involves the smallest number of people in most countries. This is also illustrated in the principal component analysis in Appendix 6 at the end of this chapter, which shows that three of the four types of unconventional behaviour are very closely aligned, loading highly and positively on a first component that explains 70 per cent of the variance, while the fourth type – lawful demonstration – stands out as a dimension of its own (accounting for a further 24 per cent of the variance).

The rarest form of protest activity is illegal protest. Although the ESS asks about it, it was decided not to include it here as a form of participation,
Table 10.4 Participation in four kinds of political protest activities in past 12 months, by country (percentages)

Country | Signed petition | Buy ethical products | Boycott products | Lawful demonstration | Unweighted base
Austria | 27 | 30 | 22 | 8 | 2257
Norway | 36 | 35 | 19 | 10 | 2036
Switzerland | 39 | 45 | 31 | 8 | 2040
Sweden | 41 | 55 | 32 | 6 | 1999
Denmark | 28 | 44 | 23 | 3 | 1506
Netherlands | 22 | 26 | 10 | 21 | 2364
Finland | 24 | 42 | 27 | 7 | 2000
Belgium | 34 | 27 | 13 | 8 | 1899
Luxembourg | 29 | 30 | 16 | 2 | 1552
Ireland | 28 | 25 | 14 | 4 | 2046
Germany | 30 | 39 | 26 | 8 | 2919
Israel | 18 | 18 | 15 | 10 | 2499
UK | 40 | 32 | 26 | 11 | 2052
France | 35 | 28 | 27 | 5 | 1503
Czech Republic | 16 | 23 | 11 | 18 | 1360
Spain | 24 | 12 | 8 | 11 | 1729
Slovenia | 12 | 10 | 5 | 3 | 1519
Italy | 17 | 7 | 8 | 17 | 1207
Portugal | 7 | 7 | 4 | 4 | 1511
Hungary | 4 | 10 | 5 | 4 | 1685
Greece | 5 | 7 | 9 | 5 | 2566
Poland | 7 | 10 | 4 | 1 | 2110
Average of all countries | 26 | 24 | 17 | 9 | 42359
because it covers such a small percentage of the population (2 per cent or less) in most countries, and also because it is so different in nature from the other measures. However, it is worth noting that, like lawful demonstration, illegal protest also stands out in our analysis as a completely different form of participation from all the others.

Another pattern is evident in Table 10.4. It might be thought that unconventional forms of political participation would be relatively less widespread in well-established democracies (where the responsiveness and accountability of the political system make protest politics less necessary), and more widespread in the newer and less well-developed democracies of Central and Southern Europe. But an examination of Table 10.4 shows that this is not the case. Austria, Norway, Switzerland, Sweden and Denmark have comparatively high rates of protest activity on three measures (though not on lawful demonstrating, where they are average or low), while Poland, Greece, Hungary and Portugal are far below average in all four columns. As some theories predict (e.g. Inglehart
1997, pp.312–15; Barnes et al., 1979), the circumstances of rich and advanced democracies seem to allow, even encourage, unconventional and lawful forms of protest activity, while the newer and less well-established democracies tend to discourage them.

Overall participation

The acid test of whether participation in our 22 countries has a single underlying and unifying structure is to put all our measures together and analyse them as a whole. This makes stringent demands on the ESS data, because they cover not only five different forms of social and political participation but also a set of 25 measures in 22 countries. Most studies of individual political participation find little in common among a small number of measures of conventional political participation in only one country, never mind a larger and much more varied set of measures of participation in many countries.

The measure of overall participation is more complicated than its component parts, as reported so far in percentages in Tables 10.1–10.4. We cannot simply add our percentage figures together and divide by the total number of measures of participation, because some of the figures are typically rather high (voting turnout) and others much smaller (participating in demonstrations). A five or ten percentage point gap between voting turnouts of 75, 80 and 85 per cent is not greatly important, but the one-point difference between protest activity rates of 3 per cent and 4 per cent is proportionately far larger. To calculate simple averages would thus give disproportionate weight to voting, and too little importance to small but significant differences in protest activity. Therefore, in order to make our country scores for different kinds of activity directly comparable with one another, they have been standardised by subtracting the mean of all 22 countries from each score, and then dividing by the standard deviation of the measure.
This makes the score for each country directly comparable with all the others. Table 10.5 shows the summary scores for all 25 measures of participation. Recall that it is this summary score for all kinds of participation that has determined the rank order of the countries in the previous percentage tables. Thus, Austria, Norway, Switzerland, Sweden, Denmark and the Netherlands are well above the average score; Portugal, Greece, Hungary and Poland are well below it; while the other countries fall somewhere in between. Principal component analysis (Appendix 7 at the end of this chapter) yields a first, large component that explains 51 per cent of the variance and loads heavily and positively on no fewer than 23 of our 25 measures of participation. This is strong evidence indeed that there is a single underlying
structure or backbone that links different forms of social and political participation at the country level. High participation on any one of the 23 measures in a given country is highly likely to be accompanied by high participation on any one of the other measures. So far as countries are concerned, participation is not fragmented into discrete parts; it is all of a piece. Participation is participation is participation. Table 10.5
Overall participation scores in 22 countries

Country | Total participation score | Unweighted base
Austria | 1.13 | 2257
Norway | 1.13 | 2036
Switzerland | 0.92 | 2040
Sweden | 0.92 | 1999
Denmark | 0.76 | 1506
Netherlands | 0.56 | 2364
Finland | 0.55 | 2000
Belgium | 0.55 | 1899
Luxembourg | 0.53 | 1552
Ireland | 0.51 | 2046
Germany | 0.48 | 2919
Israel | 0.41 | 2499
UK | 0.34 | 2052
France | 0.11 | 1503
Czech Republic | −0.25 | 1360
Spain | −0.38 | 1729
Slovenia | −0.39 | 1519
Italy | −0.52 | 1207
Portugal | −0.72 | 1511
Hungary | −0.90 | 1685
Greece | −0.98 | 2566
Poland | −1.02 | 2110

Note: Figures in this table are the average standardised scores for 25 measures of social and political participation
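The standardisation behind scores of this kind is an ordinary z-score transformation, applied measure by measure and then averaged within each country. A sketch with invented numbers (not the ESS data):

```python
import numpy as np

# Illustrative matrix of participation rates: rows are countries, columns
# are measures with very different typical levels (e.g. voting turnout vs.
# demonstrating); the values are invented, not the ESS figures.
rates = np.array([
    [88.0, 27.0, 8.0],
    [82.0, 24.0, 7.0],
    [75.0, 12.0, 4.0],
    [66.0,  7.0, 1.0],
])

# Standardise each measure across countries: subtract the cross-country
# mean and divide by the standard deviation of that measure.
z = (rates - rates.mean(axis=0)) / rates.std(axis=0)

# A country's overall score is its average standardised score, so large
# but unremarkable gaps in turnout no longer swamp small gaps in protest.
overall = z.mean(axis=1)
print(overall.round(2))
```

After standardisation, each measure contributes on the same scale, so a country above the mean on most measures ends up with a positive overall score and a country below the mean with a negative one, as in Table 10.5.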
As before, there are, however, a few exceptions to this general rule. Taking part in illegal demonstrations is so completely different from our 25 legal kinds of participation, that we have not even included it in the analysis here. In addition, as noted, voting and taking part in lawful demonstrations are distinct forms of participation separate from the others. Voting is the only kind of political involvement that covers a majority of the population and is, therefore, a poor predictor of other kinds of conventional political activity. Similarly, and at the other extreme, lawful demonstrating is qualitatively and quantitatively different from all the other forms, and is also a poor predictor of general participation.
What explains the national patterns?

What explains the patterns of participation we have found in the 22 countries? Why do some countries show high rates of involvement across almost all the measures, while others have consistently low figures?

Perhaps the first thing to note about Table 10.5 is that countries with a strong resemblance tend to cluster together. They are often similar sorts of country, with either common borders or strong historical and cultural connections. Norway, Sweden, Denmark, Finland and the Netherlands are all high-participation countries at the top of the table, together with Austria and Switzerland. At the bottom is a very different cluster of similar countries, namely the ex-communist countries of Poland, Hungary, Slovenia and the Czech Republic. The Mediterranean countries of Greece, Portugal, Italy and Spain also have consistently low rates of participation. In between, clustered around the average on most measures, is another group of neighbouring countries – Ireland and the United Kingdom, Belgium and Luxembourg, Germany and France.

Granted that these families of countries have rather similar levels of participation, what exactly determines those levels? Perhaps it is culture, or history, or religion, or national wealth, or government and politics, or the legal system, or maybe the length of time a country has been a stable democracy, or its particular brand of public policies? It is notable, for example, that the older and richer democracies are at the top of Table 10.5, while the newer and poorer ones are generally at the bottom. We can pin down the co-variates of participation at the country level more precisely by looking at correlations between the participation scores reported in Table 10.5 and a set of measures describing the social, economic and political characteristics of the 22 countries.
From this, we can get an idea of what sorts of country characteristics are most closely associated with participation rates. For the purposes of this analysis, we compiled a set of indicators describing the social, economic and political characteristics of the 22 countries. Some of these are simple measures consisting of a single estimate (gross national product per capita, for example), while others combine more than one measure in order to capture more subtle and complex circumstances (rule of law, for example). The variables we included in our analysis are as follows (with their source shown in parentheses):

• National wealth and income inequality (Gini index).
• Political rights and civil rights, and the two combined to form a single democracy score (Freedom House ratings). The higher the score, the more democratic the country.
• Political stability and lack of violence: a composite measure provided by the World Bank, covering ethnic tension, internal conflict, constitutional changes, military coups, political fragmentation of parties and groups, social unrest, terrorist threats and armed conflict. The scale is fixed with a maximum of +2.5 for the greatest government stability and a minimum of −2.5.
• The rule of law: a composite measure provided by the World Bank, covering the extent of black markets, enforceability of private and government contracts, corruption in banking, crime and theft as obstacles to business, losses from and costs of crime, and the unpredictability of the judiciary. The scale is fixed with a maximum of +2.5 for the greatest rule of law and a minimum of −2.5 for the least.
• Government effectiveness: a composite measure provided by the World Bank, covering bureaucratic quality, transaction costs, quality of public health care and government stability. The scale is fixed with a maximum of +2.5 for the most effective government and a minimum of −2.5 for the least.
• Corruption (Transparency International).
• Religious, linguistic and ethnic fractionalisation.
• Government expenditure on health and education, shown as a percentage of GDP (UN Human Development Report).
• Social and economic indicators, including population size and density, degree of urbanisation, size of the agricultural sector, life expectancy, and the educational attainments of the population.

A full account of these measures and how they are produced is presented in Delhey and Newton (2005). The correlations between this list of variables and country rates of overall participation are shown in Table 10.6. The figures show that rates of participation are most closely associated with three groups of country characteristics.
First and foremost is the nature of the social and political order, as measured by the World Bank's law and order index, the absence of political corruption (Transparency International) and, above all, the rule of law (World Bank), which has by far the closest association with participation. The second group of characteristics covers democratic effectiveness and stability, as measured by the extent to which a country has fully developed democratic government (Freedom House) and by the World Bank's composite measure of government effectiveness. The third group of variables covers national wealth and its consequences, especially GDP per capita and life expectancy, both of which tend to be highest in the wealthiest nations.
Table 10.6 Pearson correlations between participation scores and country characteristics

Social and political order
  Rule of law                                     .90**
  Absence of political corruption                 .84**
  Law and order                                   .74**

Democratic effectiveness and stability
  Government effectiveness                        .79**
  Freedom House democracy score                   .65**
  Political stability                             .39

Wealth
  GDP per capita                                  .68**
  Life expectancy                                 .60**
  Economic equality                               .23

Government expenditure
  Government expenditure on education and health  .49**
  Government expenditure on education             .46*
  Government expenditure on health                .37

Population
  Foreign-born population                         .35
  Linguistic fractionalisation                    .33
  Education                                       .30
  Religious fractionalisation                     .11
  Ethnic fractionalisation                        .10
  Population density                              .09
  Population size                                 −.21

Other
  Urbanisation                                    .30
  Murder rate                                     −.31
  Size of agricultural sector                     −.70**

Notes: Pearson correlations. ** significant at the 0.01 level, two-tailed test; * significant at the 0.05 level, two-tailed test. All the other figures in the table should be set aside on the grounds that they might be explained by chance
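The coefficients in Table 10.6 are ordinary Pearson correlations computed over country-level data points, and the significance stars reflect how large r must be when there are only 22 cases. A sketch with invented values (not the actual country data, and with n = 8 here rather than 22):

```python
import math
import numpy as np

# Invented country-level data: overall participation score against a
# rule-of-law style index; illustrative values only.
participation = np.array([1.1, 0.9, 0.6, 0.4, 0.1, -0.3, -0.5, -0.9])
rule_of_law   = np.array([1.8, 1.6, 1.2, 1.0, 0.5,  0.2, -0.1, -0.6])

# Pearson correlation coefficient.
r = np.corrcoef(participation, rule_of_law)[0, 1]

# Two-tailed significance is judged from the t statistic with n - 2
# degrees of freedom; with few cases, r must be large to reach p < 0.05.
n = len(participation)
t = r * math.sqrt((n - 2) / (1 - r ** 2))
print(round(r, 3), round(t, 2))
```

With so few observations, a correlation of around .4 can fail to reach significance, which is why the chapter sets aside the unstarred figures in Table 10.6 as possibly due to chance.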
The other variables in the list generally have weak and statistically non-significant associations with participation, with the exception of the size of a nation's agricultural sector, which is associated with lower rates of participation. This suggests that, contrary to some suggestions, participation does not decline in large-scale, urban-industrial societies but, on the contrary, tends to be lowest in agricultural societies. However, the size of the agricultural sector is also closely associated with national wealth, and it may be that it is not rural and farming lifestyles but economic development that matters most.

How can we be sure which correlations are the really important ones when we have 10 or 12 different measures closely associated with participation? Is,
for example, the rule of law the most important predictor of participation, or is national wealth (which happens to be related to the rule of law) what really matters? If we control for national wealth, is it actually the best-educated populations (usually found in wealthy countries) that participate most? We can untangle some of these complicated relationships by means of another statistical technique, regression analysis, which allows us to estimate the strength of the statistical association between participation and any one characteristic of a country while holding all other characteristics constant. So we can estimate, for instance, how much of the variance in country-level participation is explained by national wealth while holding constant the effect of other factors such as education, religion, type of government and democratic stability.5

The model fitted suggests that the following variables are most consistently and strongly associated with high rates of participation – that is, they are the smallest number of variables that explain the greatest amount of variation in participation across the 22 countries (Table 10.7). So, participation is most strongly and consistently associated with only a few of our long and complicated list of variables, namely the rule of law, national wealth (as measured by GDP), income equality, law and order, government effectiveness and political stability. Of these, the rule of law is by far the most dominant single factor. This means that countries with a small black economy, little business corruption, a low crime rate, a predictable (rule-bound and impersonal) judiciary, and enforceable government and private contracts tend to have the highest rates of social and political participation. It is not so much that any of these features on its own directly encourages high levels of participation, but that all of them combine to produce a social and political culture and a set of economic, civil and government institutions
5 A technical problem for regression analysis occurs when two or more of the explanatory variables are themselves closely associated – the problem of multicollinearity. For example, rule of law and political corruption are different measures of much the same thing, and therefore closely associated. Putting both of these variables into the same regression equation disturbs the results and makes them statistically erratic and unreliable. The solution is to run a series of regressions (or models) that do not include explanatory (independent) variables that are themselves closely associated. If the same variables appear strongly and consistently in different regression models with different combinations of measures, we can be reasonably sure that they are indeed associated with participation in their own right, and not simply because they are riding on the shirt-tails of other variables.
that are conducive to high rates of participation. It may also be that the cause and effect relationship works in the other direction, and that high rates of citizen participation help to produce an effective legal system, a stable democracy and an efficient economy.

Table 10.7 Country-level predictors of participation

Type of participation          Country-level predictors
Voluntary associations         Rule of law, GDP, income equality
Conventional politics          Rule of law, government effectiveness, law and order, GDP
Protest politics               Rule of law, government effectiveness
Social and helping behaviour   Rule of law
Overall participation          Rule of law, government effectiveness, GDP, political stability
Conclusion

We started our analysis with the expectation that each of our 22 countries would have a different profile of participation. This expectation was based on the idea that since each country has its own special (perhaps unique) culture, history, political and legal system, and social, economic and political circumstances, we could also expect it to have its own particular pattern of participation. Extensive research at the individual level also shows that individual citizens usually specialise in one or another kind of political activity, but rarely combine them. The ESS survey confirms this individual-level finding.

But our research here is not concerned with individual participation, but with a cross-national comparison of whole countries. The focus is not on whether this or that individual votes, but on the voting turnout of the whole nation state. The ESS is a rare and valuable resource for this sort of work (besides its value for individual-level research) because it covers more than 20 nations of varied kinds – a large enough number to produce statistically valid generalisations.

The utility of survey research is highlighted by the fact that our reasonable and common-sense expectations of national uniqueness were by no means confirmed by the results. Each of our 22 countries tends to repeat its level of participation across 23 out of 25 different measures tapping five different types of participation (voluntary associations, conventional politics, protest politics, social life and helping behaviour). Countries with a high rate of participation on any one measure are likely to have a similar rate on most of the other measures. As a result, countries can be satisfactorily ranked on a single scale of participation.
In part this seems to be because countries are not unique; there are indeed families of countries with a strong resemblance to one another. The northern countries (Finland, Sweden, Norway, Denmark and the Netherlands) cluster together at the top of most of the participation tables. The Mediterranean countries (Greece, Portugal, Italy and Spain) generally have low participation rates on all the measures, as do the Central European countries (Poland, Hungary, Slovenia and the Czech Republic). Countries with a common border and comparable cultures also often have similar participation rates: Austria and Switzerland; Norway, Sweden and Finland; the Benelux countries; Ireland and the UK; Portugal and Greece; Hungary and Poland. There are, to be sure, some deviations from this general pattern, but the overall picture is rather simple and clear.

The fact is that wealthy countries with effective and stable governments have high rates of all kinds of participation. Above all, countries governed by the rule of law (low levels of crime, theft, corruption and black market activity, a predictable judiciary, and enforceable private and government contracts) are highly likely to show high rates of social, political and civic engagement. In other words, countries that are stable and orderly, rule-bound and law-abiding, and comparatively crime and corruption free are likely to have high rates of participation, and these countries are also likely to be wealthy, with stable governments, high-quality public bureaucracies and good public services.

It is not true that the welfare state drives out the humanitarian and charitable efforts of private associations and individuals. On the contrary, some of the most advanced welfare states in the study have the highest rates of such activity. Nor is it true that the welfare state colonises civil society and undermines natural forms of social solidarity, as some have suggested.
On the contrary, some of the most advanced welfare states seem to have the highest levels of social solidarity, as measured by social contacts and helping behaviour. And the most developed civil societies, as measured by involvement in formally organised voluntary associations, tend to have the most intensive informal social relations and the most helpful and neighbourly relations among their citizens.

Nor does it seem to be true that one form of participation is the functional equivalent of or alternative for another. Participation in organised social clubs and associations is generally high in countries that also have high rates of other forms of socialising. Similarly, countries with a strong third sector of charitable and humanitarian associations also tend to have a lot of helping behaviour on the part of private individuals, and vice versa.

In this respect, another underlying conclusion of this research is that survey research is sometimes capable of overturning the perfectly reasonable
expectations of common sense – what everybody knows to be true because they have their own experience of living in society. Sometimes conventional wisdom turns out to be wrong and sometimes it turns out to be right. We need empirical evidence from social surveys of this kind to be sure.

References

Barnes, S., Kaase, M. et al. (1979), Political Action: Mass Participation in Five Western Democracies, London: Sage.
Deflem, M. (ed.) (1996), Habermas, Modernity and Law, London: Sage.
Delhey, J. and Newton, K. (2005), 'Predicting Cross-National Levels of Social Trust: Global Pattern or Nordic Exceptionalism?', European Sociological Review, 21 (4), pp. 311–327.
Inglehart, R. (1997), Modernization and Postmodernization, Princeton, NJ: Princeton University Press.
Little, R. and Rubin, D. (2003), Statistical Analysis with Missing Data, New Jersey: Wiley.
Parry, G., Moyser, G. and Day, N. (1992), Political Participation and Democracy in Britain, Cambridge: Cambridge University Press.
Verba, S. and Nie, N. (1972), Participation in America: Political Democracy and Social Equality, New York: Harper and Row.
Verba, S., Nie, N. and Kim, J-O. (1978), Participation and Political Equality: A Seven Nation Comparison, Chicago: University of Chicago Press.
Wolfe, A. (1989), Whose Keeper? Social Science and Moral Obligation, Berkeley, CA: University of California Press.
Appendix 1: Principal component analysis of individual political participation in the United Kingdom

                                            Component
Indicator                                    1        2        3        4
Interested in politics                     0.130    0.131    0.812   −0.203
Discuss politics                           0.081    0.211    0.814   −0.031
Voted in last election                    −0.012   −0.080   −0.172    0.939
Contacted politician                       0.374    0.160    0.332    0.023
Worked for political organisation          0.691   −0.021    0.166    0.127
Displayed campaign material                0.670    0.154   −0.036   −0.181
Lawful demonstration                       0.631    0.108   −0.016   −0.056
Donated money to political organisation    0.499    0.096    0.234    0.040
Signed petition                            0.293    0.606    0.008   −0.191
Boycotted products                         0.055    0.793    0.173   −0.019
Bought ethical products                    0.070    0.764    0.254    0.039
Variance explained (%)                    16.65    15.62    14.92     9.24
Note: The means of extraction in the tables in Appendices 1–6 is principal component analysis with varimax rotation. Varimax rotation is a statistical technique that keeps the different components orthogonal, that is, uncorrelated with one another. In the table above this means that citizens who engage in the activities covered by the first component of political activity tend not to be the same as those who are typically engaged in the activities covered by the second component.
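The extraction procedure the note describes – principal components followed by varimax rotation – can be sketched in a few lines of numpy. This is a minimal illustration on random placeholder data, not the ESS figures; the rotation matrix is orthogonal by construction, which is why the rotated components remain uncorrelated.

```python
# Minimal sketch of principal component extraction plus varimax rotation.
# Data are random placeholders; the varimax routine is a standard SVD-based
# implementation, not the authors' software.
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a p x k loading matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L**3 - L @ np.diag((L**2).sum(axis=0)) / p)
        )
        R = u @ vt  # product of orthogonal matrices, hence orthogonal
        d_new = s.sum()
        if d_new < d * (1 + tol):
            break
        d = d_new
    return loadings @ R, R

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 6))          # 50 cases, 6 indicators
corr = np.corrcoef(data, rowvar=False)   # PCA on the correlation matrix
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:2]    # keep the two largest components
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

rotated, R = varimax(loadings)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each indicator's communality (its summed squared loadings) is unchanged by the rotation; only the distribution of loading across components shifts, which is what makes the rotated tables easier to interpret.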
Appendix 2: ESS questions used to measure 25 forms of social and political participation

Voluntary associations

'For each of the voluntary organisations I will now mention, please use this card to tell me whether any of these things apply to you now or in the last 12 months, and, if so, which?'
Types of activity: None, Member, Participated, Donated money, Voluntary work.
Types of organisation: Sports club or club for outdoor activity; cultural or hobby activities; trade union; business, professional or farmers; consumer or automobile; humanitarian aid, human rights, minorities, or immigrants; environmental protection, peace or animal rights; religious or church; science, education or teachers and parents; social club for the young, the retired/elderly, or friendly society; political or action group; other.

Conventional political participation

'How interested would you say you are in politics – very/quite/hardly/not at all/don't know?'
'Some people don't vote nowadays, for one reason or another. Did you vote in the last [country] national election [last election of country's primary legislative assembly]?'
'How often would you say you discuss [in the sense of discussing with friends or chatting about politics or policies at, for example, one's workplace or in a bus queue to relative strangers] politics and current affairs? Every day/several times a week/once a week/several times a month/once a month/less often/don't know.'
'There are different ways of trying to improve things in [country], or help prevent things from going wrong [prevent serious problems arising]. During the last 12 months have you done any of the following [Yes, no, don't know]: Contacted a politician, government or local government official? Worked in a political party or action group? Worn or displayed a campaign badge/sticker? Donated money to a political organisation or group?'
Political protest behaviour

'There are different ways of trying to improve things in [country], or help prevent things from going wrong [prevent serious problems arising]. During the last 12 months have you done any of the following [Yes, no, don't know]: Signed a petition? Taken part in a lawful public demonstration? Boycotted certain products? Deliberately bought certain products for political, ethical, or environmental reasons?'

Meeting socially

'… how often do you meet socially [meet by choice rather than for reasons of either work or pure duty] with friends, relatives or work colleagues? Every day/several times a week/once a week/several times a month/once a month/less often/don't know.'

Helping behaviour

'Not counting anything you do for your family, in your work, or within voluntary organisations, how often, if at all, do you actively provide help for other people? Every day/several times a week/once a week/several times a month/once a month/less often/never/don't know.'
Appendix 3: Principal component analysis of country rates of participation in twelve types of voluntary association

                         Component
Association               1        2        3
Business                0.708    0.400    0.009
Culture                 0.929    0.007   −0.150
Humanitarian            0.861    0.213   −0.070
Religious               0.694    0.248    0.629
Sport                   0.924    0.132    0.118
Trade union             0.689    0.601   −0.212
Other voluntary         0.877   −0.193    0.089
Consumer                0.765   −0.265   −0.187
Social club             0.873   −0.085   −0.295
Environmental           0.760   −0.391    0.342
Political               0.773   −0.136   −0.094
Science                 0.859   −0.386   −0.082
Variance explained (%) 66.200    9.000    6.210
Note: For technical reasons, the figures in this table and in the other tables based on principal component analysis that follow are based not on the individual percentage figures shown in Table 10.1, but on our standardised measure of participation.
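The note refers to a standardised measure of participation. One common construction, shown here purely for illustration (the rates are invented and this may not match the authors' exact scaling), is to convert each country's rate on a measure to a z-score before the principal component analysis:

```python
# Illustrative standardisation of country participation rates to z-scores,
# so that measures on different percentage scales become comparable.
# The rates below are invented placeholder values.
import numpy as np

rates = np.array([12.0, 35.0, 8.0, 50.0, 27.0])  # hypothetical country rates (%)

# Standardise across countries: mean 0, standard deviation 1
z = (rates - rates.mean()) / rates.std()

print(np.round(z, 2))
```

After this transformation every measure contributes on the same scale, so no single high-percentage measure dominates the component extraction.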
Appendix 4: Principal component analysis of country rates of participation in formal and informal types of social activity and formal and informal helping behaviour

                           Component
Social involvement          1       2       3
Humanitarian association  0.90   −0.21    0.07
Social clubs              0.86    0.01   −0.49
Meet socially             0.85   −0.36    0.29
Helping behaviour         0.69    0.71    0.18
Variance explained (%)   68.85   16.84    8.99
Appendix 5: Principal component analysis of country rates of participation in seven kinds of conventional political activity

                                      Component
Activity                               1        2        3
Discuss politics                     0.801    0.457    0.076
Donated money                        0.773   −0.104   −0.462
Interested in politics               0.709    0.618   −0.223
Worked for political organisation    0.662    0.330    0.230
Displayed campaign material          0.622    0.565    0.238
Contacted politician                 0.604    0.509    0.000
Voted in the last election          −0.043   −0.162    0.958
Variance explained (%)              41.900   18.790   18.510
Notes: Extraction method is principal component analysis
Appendix 6: Principal component analysis of country rates of participation in four kinds of protest activity

                         Component
Activity                  1        2        3
Boycott products        0.952   −0.166    0.182
Buy ethical products    0.932   −0.283    0.095
Signed petition         0.946    0.083   −0.312
Lawful demonstration    0.371    0.924    0.090
Variance explained (%) 70.190   24.190    3.680
Appendix 7: Principal component analysis of country scores for 25 measures of participation

                                       Component
Indicator                               1        2        3
Culture                               0.938    0.148   −0.022
Sports association                    0.896   −0.027    0.222
Other voluntary                       0.878   −0.008   −0.112
Social club                           0.872    0.104   −0.159
Humanitarian association              0.869    0.027    0.228
Bought ethical products               0.846    0.005    0.202
Science association                   0.841    0.016   −0.291
Signed petition                       0.841    0.234   −0.088
Boycott products                      0.757    0.062    0.077
Consumer association                  0.757   −0.177   −0.248
Political association                 0.740    0.006   −0.164
Environmental association             0.735   −0.403   −0.185
Interested in politics               −0.720    0.579    0.103
Trade union                           0.690    0.140    0.515
Social meeting                        0.676    0.269   −0.089
Religious or church organisation      0.663   −0.202    0.413
Donated to political organisation     0.655   −0.047   −0.438
Business association                  0.622    0.224    0.309
Discuss politics                      0.604    0.306    0.133
Help others                           0.596    0.713    0.155
Contacted politician                  0.582    0.496    0.319
Displayed campaign material           0.553    0.582    0.072
Worked for political organisation     0.538    0.137   −0.326
Lawful demonstration                  0.222    0.373   −0.577
Voted in last election                0.172   −0.294    0.562
Variance explained (%)               51.110    8.960    8.190
11

A continental divide? Social capital in the US and Europe

Pippa Norris and James Davis*
Introduction

Many observers believe that the United States has experienced a steep erosion of social capital during the post-war era, with a fall in generalised reciprocity (including social trust and social tolerance) and in social connectedness (including formal associational participation and informal socialising). Secular social trends are thought responsible for these developments, notably the rise of television entertainment. Putnam (2000) suggests that the haemorrhaging of social capital has had important consequences for civic participation in the United States, and thus for the health and vitality of American democracy. The key question is whether this phenomenon is another case of 'American exceptionalism', due to specific causes, or whether similar patterns are also evident in Europe.

Recent decades have seen renewed interest in the world of voluntary and community associations. The core claim of 'Tocquevillian' theories of social capital is that typical face-to-face deliberative activities and horizontal collaboration within voluntary organisations far removed from the political sphere – exemplified by sports clubs, social clubs and philanthropic groups – promote interpersonal trust, social tolerance and co-operative
*
Pippa Norris is the Director of the Democratic Governance Group, United Nations Development Program and the McGuire Lecturer in Comparative Politics, Harvard University. James Davis is a Senior Lecturer in Sociology, University of Chicago and Senior Research Scientist, National Opinion Research Center.
behaviour. In turn, these norms are regarded as cementing the bonds of social life, creating the foundation for building local communities, civil society and democratic governance. In a 'win-win' situation, participation in associational life is thought to generate individual rewards, such as career opportunities and personal support networks, as well as facilitating community goods, by fostering the capacity of people to work together on local problems.

As noted, 'social capital' is understood to combine generalised reciprocity (exemplified by social trust and social tolerance) and social connectedness (including ties developed through formal associational membership and through informal networks). Understanding the conditions conducive to the production of social capital is important, Putnam suggests, if this provides a valuable social resource. Communities rich in the norms and networks facilitating collaboration, it is claimed, enjoy many virtues, potentially becoming more efficient, prosperous, productive and democratic. Any long-term decline in this resource should therefore generate genuine concern.

In Bowling Alone Robert Putnam (2000) has assembled a substantial battery of evidence suggesting that the United States experienced a steady and deep erosion of social capital during the post-war era. He warns that multiple indicators display a consistent secular fall in America since the 1960s and 1970s, including membership of voluntary associations, indicators of traditional political participation, civic attitudes, the strength of informal social ties, and levels of social trust. The causes of this phenomenon are complex but are thought to include the modern pressures of time and money, the movement of women into the paid work-force, stresses in the two-career family, geographic mobility and suburban sprawl, and the role of technology and the mass media.
The ubiquity of television entertainment, in particular, is thought to play a critical role in individualising leisure hours for 'couch potatoes'. Putnam's work has stimulated extensive popular and scholarly debate about the nature of social capital, its causes and consequences, generating multiple studies of this phenomenon. Ladd (1996) and Schudson (1996) have questioned whether studies have exaggerated the degree of civic engagement in the past and overlooked more recent forms of expression and activism via new social movements. But Putnam's account has also popularised this concept, generated a massive literature across all the social sciences, and attracted a favourable reception from many who believe that the quality of American civic life has suffered a substantial impoverishment among the post-war generation.

The key question addressed by this chapter is whether similar patterns are also evident throughout comparable European democracies. After all, social capital is framed as a general theory about human behaviour and societal co-operation, not one designed to explain just the United States. Indeed, the origins for
Bowling Alone were based on observations first developed to account for patterns of Italian regional government (Putnam, 1993). The existing literature remains divided on this question. For instance, based on case studies of eight post-industrial societies including Britain, Sweden and Japan, Putnam (2002) concluded that different trends in social capital were evident in different countries, not just a single model that reflected developments in the United States.

To examine these issues, the first part of this chapter outlines the theoretical framework and the reasons why recent decades may have seen declining social capital in the United States. We then describe the logic of the research design and the sources of evidence used to analyse this issue, drawing upon the US General Social Survey and the European Social Survey (ESS), 2002, and the measures used to compare social capital. We go on to compare the richness and vitality of social capital in Europe. And finally we analyse patterns of generalised reciprocity and social connectedness among older and younger cohorts.

Tocquevillian theories of social capital

There is nothing particularly novel about expressing concern for the loss of community and the weakening of face-to-face relationships in modern society; a long tradition in sociological theory among thinkers as diverse as Émile Durkheim, Karl Marx, Max Weber, Ferdinand Tönnies and Georg Simmel has reflected on the erosion of Gemeinschaft in the family, neighbourhood and local community. Modern theories of social capital, originating in the ideas of Pierre Bourdieu (1970) and James Coleman (1988, 1990), have emphasised the importance of social ties and shared norms for societal well-being and economic efficiency.
This study focuses on the way that Robert Putnam expanded and popularised this notion in Making Democracy Work (1993) and in Bowling Alone (2000), since this account develops the strongest claims about the importance of social capital for civic participation and for good governance in the public sphere. Putnam's conceptualisation of social capital as "connections among individuals – social networks and the norms of reciprocity and trustworthiness that arise from them" (2000, p.19) combines both a structural phenomenon embodied in social connectedness (through formal associational memberships and through informal social ties) and a cultural phenomenon of generalised reciprocity (through the social norms of interpersonal trust and social tolerance). In combination the norms and networks are believed to facilitate collaboration on community problems of common concern (Figure 11.1).
Figure 11.1 The components of social capital
Note: See appendix at the end of this chapter for details of the constructions of these scales
A limitation of the literature is that subsequent studies have often focused on one or other dimension, but in doing so they may have failed to capture the power of the two features in combination. Four core claims lie at the heart of this theory.
Social networks and social trust matter for societal co-operation

The primary claim is that horizontal networks embodied in civic society, and the norms and values related to these ties, have important consequences for the people in them and for society at large, producing both private and public goods. In particular, networks of friends, colleagues and neighbours are commonly associated with the norms of generalised reciprocity in a skein of mutual obligations and responsibilities. These dense bonds then foster the conditions for collaboration, coordination and co-operation to create collective goods. The shared understandings, tacit rules, agreed procedures and social trust generated by personal contact and the bonds of friendships are believed to make it easier for people to work together in future for mutual benefit. Since the value of social capital exists in the relations among people, its measurement needs to be at a societal level, but it is far more elusive than financial capital in company shares and factory machinery, or even human capital in cognitive skills (Arrow, 2000). Organisations in civic society such as unions, churches and community groups, Putnam suggests, play a vital role in the production of social capital where they succeed in bridging divisive social cleavages, integrating people from diverse backgrounds and
values, promoting ‘habits of the heart’ such as tolerance, co-operation and reciprocity, and thereby contributing towards a dense, rich and vibrant social infrastructure.
Social capital has important consequences for democracy

Putnam goes further than other contemporary theorists in arguing that social capital also has significant political consequences. The theory can be understood as a two-step model of how civic society directly promotes social capital, and how, in turn, social capital (the social networks and cultural norms that arise from civic society) facilitates civic participation and good governance. In particular, based on his analysis of Italian regional government (Putnam, 1993), he claims that abundant and dense skeins of associational connections and rich civic societies encourage good governance. Civic society and civic norms are believed to strengthen connections between citizens and the state, such as by encouraging political discussion and mobilising electoral turnout. When the performance of representative government is effective, Putnam reasons, this should increase public confidence in the working of institutions like legislatures and the executive, maximising diffuse support for the political system. Good governance is also believed to foster strong linkages between citizens and governments that promote the underlying conditions for civic engagement and participatory democracy. The central claim is not that the connection between social and political trust operates at the individual level, so that socially trusting individuals are also exceptionally trusting of government, and indeed little evidence supports this connection (Newton and Norris, 2000; Newton, 2001). Rather, the associations between social and political trust should be evident at the societal level. Social capital is a relational phenomenon that can be the property of groups, local communities or nations, but not individuals. We can be rich or poor in social capital; I cannot.
Social capital has declined in post-war America

In Bowling Alone Putnam presents an extensive battery of evidence that associational membership, informal social ties and civic engagement have suffered substantial erosion in post-war America, along with norms of social trust, altruism and reciprocity. To check some of the evidence, key indicators of social trust, associational membership and informal sociability are summarised in Table 11.1. In these models, negative standardised regression coefficients confirm a decline over time in these indicators. But the results also suggest that these three separate components of social capital behaved in quite different ways during the last quarter of the twentieth century. So the
overall picture does not support the argument that social capital has declined consistently or strongly in America during this period.

Table 11.1 US trends in indicators of social capital

                                       Informal        Associational   Social
                                       sociability1    membership2     trust3
First year of series                   1974            1972            1974
Latest year                            2002            1994            2002
Data points                            18              15              20
Number of cases                        25,936          19,688          29,669
Standardised regression coefficients
Model A: Year                          −.025**         −.008           −.070**
Model B: Year dummies                  −.038**         −.032**         −.090**
Model C: Year                          −.207           −.003            .000
         Birth cohort                   .396           −.011           −.153**

Notes: The last three rows present the standardised regression (Beta) coefficients in a series of models. Significant coefficients are marked **; negative coefficients indicate declines in social capital
1 Informal sociability: Scale summarising average time spent on 'social evenings with relatives, neighbours, friends' or 'visits to a bar or tavern'
2 Associational membership: Scale from 0–16 reflecting memberships of each listed voluntary organisation, including fraternal groups, political clubs, hobby clubs and nationality groups
3 Social trust: Scale summarising average favourable answers on 'most people can be trusted', 'most of the time people try to be helpful', and 'most people try to be fair'
Source: US General Social Survey, 1972–2002
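The contrast between a year-only model and a year-plus-cohort model (the logic behind Models A and C) can be illustrated with a small simulation. The data below are synthetic, not the GSS: 'trust' is constructed so that its decline is purely generational, and the standardised coefficients then behave as the chapter describes, with the raw year trend negative but the year effect disappearing once birth cohort is held constant.

```python
# Hedged illustration of the year-vs-cohort decomposition: synthetic data in
# which trust depends only on birth cohort, so the raw year trend is negative
# but vanishes once cohort is controlled. Not the actual GSS data or models.
import numpy as np

rng = np.random.default_rng(42)
n = 5000
year = rng.integers(1974, 2003, size=n).astype(float)   # survey year
age = rng.integers(18, 80, size=n).astype(float)
cohort = year - age                                     # birth year
# trust depends on cohort only: later-born respondents are less trusting
trust = -0.02 * cohort + rng.normal(scale=0.5, size=n)

def std_betas(y, *xs):
    """Standardised (beta) OLS coefficients, intercept excluded."""
    z = lambda v: (v - v.mean()) / v.std()
    X = np.column_stack([np.ones(len(y)), *[z(x) for x in xs]])
    b, *_ = np.linalg.lstsq(X, z(y), rcond=None)
    return b[1:]

beta_year_only = std_betas(trust, year)                  # Model A analogue
beta_year, beta_cohort = std_betas(trust, year, cohort)  # Model C analogue

print("year alone:", round(beta_year_only[0], 3))
print("year and cohort:", round(beta_year, 3), round(beta_cohort, 3))
```

Because year and cohort are positively correlated, a purely generational decline shows up as a spurious period trend until cohort enters the model, mirroring the pattern in the social trust column of Table 11.1.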
To examine these patterns in greater detail, Figure 11.2 illustrates the fall in social trust from 1972 to 2002, as measured in the US General Social Survey. The pattern of social trust was erratic in the 1970s, displaying sharp trendless fluctuations (which may raise important questions about the measurement and stability of this social construct), but the overall picture from the start to the end of the time series displays a fall of about 10 per cent over three decades. Regression analysis confirms that the erosion was modest in size although statistically significant. Model C suggests that the erosion of social trust in America is entirely a cohort process, whereby the newer generations are less trusting, but within each cohort trust neither increases nor decreases.

Formal associational membership is another important indicator of social capital. Bowling Alone presents extensive evidence based on official
Figure 11.2 The decline in social trust in the US, 1972–2002
Note: Question: ‘Do you think that most people can be trusted, or that you can't be too careful in dealing with people?’ % agreeing that ‘Most people can be trusted’ Source: US General Social Survey, 1974–1994, weighted to compensate for over-sample of respondents from large households and over-sample of the black population in the early years
membership rolls from the historical records of a range of voluntary organisations, with the longest time series for groups such as unions and professional associations documented since 1900. Trends in associational membership in America commonly display a long-term rise until the early 1960s and then a secular fall over subsequent decades. Associational membership can be measured only from 1974 to 1994 using the General Social Survey, but here the trends are far less clear-cut. We can examine membership of seven types of organisation which are functionally equivalent to those contained in the European Social Survey, exemplified by sports clubs, professional societies and church groups.1 Figure 11.3 shows the trends in the US. It is clear from Figure 11.3 that most group memberships show a pattern of trendless fluctuation during these decades, not decline. Some types of association register modest gains from the start to the end of the series (such
1. It should be noted that this excludes certain American voluntary organisations, such as the Elks, where no equivalent mass membership associations exist in Europe.
Figure 11.3 Membership in voluntary associations in the US, 1974–1994
Note: These associations were selected as the most functionally equivalent to the list of associations included in the ESS
Source: US General Social Survey, 1974–1994
as professional groups), while some experience a modest fall (notably unions (Labour)); these patterns probably reflect the major changes in the American labour force in the shift from blue-collar manufacturing jobs to the service sector economy. A fall in church associational membership is also evident. But, as Rotolo (1999) has also observed, contrary to Putnam the survey evidence does not display any consistent and sharp erosion of membership across all types of groups.

It remains difficult to establish the validity of either set of trends, but one reason for the disparity between them could be the more limited period of data available from the GSS. A substantial fall in membership could have occurred prior to the start of the GSS series in the mid-1970s. The regression analysis models that we showed in Table 11.1 looked at trends in membership in a broader range of associations (16 types), and the results confirm that there is no significant trend either over time or over generation.

The final component of social capital concerns informal patterns of sociability, exemplified by socialising with friends, neighbours and colleagues. It could be the case that the formal dimension of belonging to traditional organisations has
A continental divide? Social capital in the US and Europe
become out-dated, although the informal aspect of social life could still remain strong in America. The results of the regression models that were shown in Table 11.1 for the scale of informal sociability suggest that this has indeed suffered a small decline over time, similar to the erosion of social trust. Yet the pattern is complex. When both year and cohort are used as predictors, their effects on sociability work in opposite ways. Within cohorts, sociability erodes with time. Yet within years, new generations are more sociable than their predecessors. These separate effects essentially cancel each other out, so that the overall net change is small. Putnam considers multiple causes for changes in social trust, sociability and associational membership in America, such as the pressures of time and money, the growing sprawl of the suburbs, and changes in the modern family and in women’s role in the workplace and at home. Although he concludes that many factors play a role, he regards changes in technology and the electronic media, particularly the rise of television entertainment as the primary source of leisure activity, as the major culprits for the erosion of social connectedness and the rise of civic disengagement in the United States, with the effects most profound among the younger generation (Putnam, 1995; 2000, p.246). During the 1950s, he argues, leisure in the US gradually moved from the collective experience of the movie theatre, urban street summer stoop, local diner and town hall meeting to become individualised by the flickering light of the television tube. The individualisation of leisure has led, he suggests, to a deep-seated retreat from public life.
Social capital in advanced industrialised societies

Are similar trends evident in Europe? If secular social trends – such as the growing pressures of time and money in modern societies, or the rise of entertainment television as the chief recreation during leisure hours – have eroded engagement in community life in post-war America, then similar developments should surely be evident in nations such as Britain, Norway, the Netherlands and others. The patterns might be more advanced in the United States, but Europe should show some indication of growing social mistrust and civic disengagement. Thus, any robust generalisations about the causes of trends in social capital should apply across a wide range of comparable societies. So, if European societies have experienced similar social trends to those in the United States, notably the rise of technology and the electronic media, as well as changes in the family and women’s role in the paid workplace, then logic suggests that a parallel fall in social capital should have occurred in Europe as well. Yet there are also many good reasons why America might prove ‘exceptional’. Lipset (1996) highlighted why the United States remains distinctive socially and politically from most other established democracies, even from its
Canadian neighbours. Could not contemporary patterns of social capital be essentially ‘path-dependent’? In particular, might not social trust and membership of voluntary associations be heavily influenced by the cultural traditions, historical experiences, and the role of the state in each society? In a study of Denmark, for example, Torpe (2003) found no general weakening in social capital (measured by civic norms, social trust, civic involvement and social networks), and he argued that institutional structures could explain this pattern, notably that the strength of the Danish welfare state facilitates the production of social capital. Moreover, patterns of associational membership could depend upon the organisational sectors predominant in any given society, such as the strong role of trade unions in Norway and Sweden, the influence of the Catholic Church in Ireland, the position of military veterans’ associations in the United States (Van Deth, 1997), and the resistant class structure of the UK (Johnston and Jowell, 2001). The persistent strength of religiosity and the role of the church in America also differ sharply from processes of secularisation in Western Europe and in post-Communist societies (Norris and Inglehart, 2004). The role of civil society is also expected to differ sharply given the history of post-Communist nations, where the Soviet Union co-opted voluntary organisations such as trade unions and youth groups to mobilise support for the party (Rose et al., 1997). In Central European states such as Poland, where the church took an oppositional stance to the state, religious organisations remained stronger after independence, whereas in Hungary the credibility of the Catholic Church was eroded by collaboration with the Communist government (Borowik, 2002). Despite the intellectual origins of his theory in his Italian study, Putnam is wary of claiming that social capital has eroded beyond American borders. 
In his recent eight-nation comparative study, he concludes that a diverse picture is evident in established democracies such as Sweden, Britain, Japan and France: “Our investigation has found no general and simultaneous decline in social capital throughout the industrial/post-industrial world over the last generation” (Putnam, 2002, p.410). The evidence suggests that certain common trends are found in many of the ESS participating countries, including waning participation in elections and declining membership of political parties, unions and churches. But Putnam remained agnostic as to whether a growth in newer and more informal mechanisms of social connectedness had occurred and, if so, what the consequences of this development might be for the pursuit of collective goals and social inequality. Other comparative studies of membership and activism in voluntary associations in Western Europe – including in interest groups, NGOs and new social movements – also suggest considerable cross-national diversity. Falling membership trends are evident in parties and churches (due, respectively, to processes of partisan de-alignment and secularisation), yet during recent decades an expansion in new social movements (such as environmentalist
and women’s groups) is also widely evident (Van Deth, 1997). For these reasons we need to confirm whether the erosion of social capital is yet another case of American exceptionalism, or whether similar patterns are evident in Western and post-Communist Europe.

Evidence and measures

Putnam’s thesis contains strong claims about the components of social capital that are open to empirical testing across a range of advanced industrialised societies. These claims lay the foundations for developing a reliable and valid measure of social capital, which can then be used for comparison across countries. His arguments suggest that any measure needs to take account of both structural and cultural dimensions of social capital simultaneously, that is, the strength of social networks (measured by formal associational membership and informal friendships) and the cultural norms of generalised reciprocity (measured by feelings of social trust and tolerance). To examine the European evidence, this chapter draws upon the European Social Survey, 2002.2 Because it is a new survey, we cannot use it to examine longitudinal trends during the post-war era. As a proxy measure, however, we can compare patterns of social capital by cohort of birth. If we discover substantial differences between older and younger cohorts, we interpret these as indicators of social trends. Traditional theories of socialisation suggest that habitual patterns of political behaviour are generally acquired during an individual’s formative years – in the family, school, workplace and local community – and that these habits gradually rigidify over time, creating persistent differences among successive birth cohorts. If social capital has eroded over time, and if young people acquired social norms of behaviour which have proved fairly stable over their lifetimes, then, compared with their parents and grandparents, young people should display different patterns of social capital which persist as they age.
Along similar lines, Putnam uses certain broadly linear trends in civic activism over successive age cohorts – for example, steadily falling levels of voting turnout or membership of PTAs – to support a generational interpretation of the American evidence. It follows that, owing to distinctive historical experiences, any cohort analysis that shows strong contrasts by nation or type of society, such as major differences between Western Europe and post-Communist societies, should also illuminate the factors behind the patterns we uncover.

2 Data, along with comprehensive documentation, are accessible at http://ess.nsd.uib.no. For more details, including the questionnaire and methodology, see www.europeansocialsurvey.org. We are most grateful to the European Commission and the ESF for their support for this project and to the work of the ESS Central Coordinating Team, led by Roger Jowell, for making the data available.
Of course any differences by cohort could also be attributable to life-cycle effects which are produced by the experience of changing individual circumstances. For instance, patterns of civic activism, social networks and associational membership may all alter when people leave home for educational and work-force opportunities, or when they start their own families and settle down within a local community, or when they eventually enter retirement. There may also be period effects which can be attributed to a particular major historical event that had a decisive impact upon all citizens in a society (or many societies) at one point in time, such as the Great Depression of the late 1920s and 1930s, or the end of World War II, or the 1960s’ student movements, or the transition to democracy in post-Communist Europe. The methodological challenge is to disentangle the separate components of social change (Mason and Fienberg, 1985; Alwin and Krosnick, 1991). Ideally we need time-series data from panel surveys which monitor patterns of political activism among the same individuals as they gradually age. But unfortunately panel surveys monitoring social capital are rare within single nations and simply unavailable cross-nationally. In any case there are limits to monitoring consistent trends over time with a succession of cross-sectional surveys. And in our case, as the first systematic studies monitoring social trust and associational activism in Western Europe only started in the late 1960s or early 1970s, and only during the 1990s for post-Communist societies, the problem is greater still. So, in the absence of panel and time-series data, we have analysed the 2002 European Social Survey using two approaches.
First, we compare regression models of linear age effects (implying secular trends that progress steadily from young people to the elderly), with contrasts between younger, middle-aged and older cohorts (using dummy variables to test for curvilinear life-cycle patterns peaking in middle age) to explore which provides a better fit to the data. If linear models prove stronger, they suggest (but do not prove) substantial intergenerational differences in social capital, indicating the direction in which prevailing trends are moving (De Graaf, 1999). Secondly, we can also compare age-related patterns of activism in many European societies to see whether historical experiences leave a distinct imprint. Many of the countries under comparison are long-established democracies, although Spain, Greece and Portugal became consolidated democracies only in the 1970s, while the post-Communist states of Slovenia, the Czech Republic, Hungary and Poland had their first free elections during the early 1990s. So, if the experience of democracy during each cohort’s formative years stamps a lasting impression on political attitudes and behaviour, then we would expect to observe contrasting patterns of activism by age cohort when comparing established democracies, newer democracies and post-Communist nations.
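The first approach, comparing the fit of a linear-age specification against an age-group specification, can be sketched in a few lines. This is an illustrative reconstruction, not the authors’ code: it simulates hypothetical survey data with a purely linear age effect, fits both specifications by ordinary least squares, and compares adjusted R² values.

```python
import numpy as np

def adjusted_r2(y, X):
    """Fit OLS with an intercept and return the adjusted R-squared."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    n, k = X1.shape
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - k)

# Simulated respondents with a purely linear age effect on a hypothetical trust score
rng = np.random.default_rng(0)
age = rng.integers(18, 90, size=5000)
trust = 10 + 0.03 * age + rng.normal(0, 3, size=5000)

# Specification A: age in years (a steady secular trend)
linear = adjusted_r2(trust, age[:, None].astype(float))

# Specification B: dummies for middle-aged (30-59) and older (60+), younger as baseline
dummies = np.column_stack([(age >= 30) & (age < 60), age >= 60]).astype(float)
grouped = adjusted_r2(trust, dummies)

print(round(linear, 3), round(grouped, 3))
```

Adjusted R² is used here because the two specifications have different numbers of parameters; with data generated by a genuinely linear trend, the linear model should fit better than the coarse three-group version.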
Comparing social capital in Europe

To start to examine the European evidence we can first describe the basic distribution of social capital in the societies under comparison. In particular we can demonstrate which associations are most popular in Europe and also how nations vary across the two key dimensions of social capital, namely formal social networks (measured by associational membership) and the norms of generalised reciprocity (social trust). On this basis we can then use cohort analysis to establish whether patterns vary systematically by age group.

Figure 11.4  Participation in voluntary associations by sector, ESS-2002
Note: Question: “For each of the voluntary organisations I will now mention, please use this card to tell me whether any of these things apply to you now or in the last 12 months, and, if so, which.”
Source: European Social Survey, 2002. Integrated dataset. Weighted by dweight and pweight
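The “dweight and pweight” weighting mentioned in the note can be illustrated with a toy calculation. This is a hedged sketch with fabricated data, not the ESS weighting code: it assumes, as the ESS documentation describes, that the design weight corrects for unequal selection probabilities within a country while the population-size weight rescales countries relative to one another when samples are pooled.

```python
import numpy as np

rng = np.random.default_rng(3)

# Fabricated pooled sample from two hypothetical countries of unequal population size
n = 1000
country = np.where(np.arange(n) < 500, "big", "small")
member = rng.random(n) < 0.2                    # e.g. belongs to a sports club (invented rate)
dweight = rng.uniform(0.5, 2.0, n)              # design weight: within-country selection
pweight = np.where(country == "big", 1.5, 0.4)  # population-size weight for pooling

# A pooled estimate weights each respondent by the product of the two weights
w = dweight * pweight
pooled_share = np.average(member, weights=w)
print(round(pooled_share, 3))
```

Without the population-size weight, each country would contribute in proportion to its sample size rather than its population, biasing pooled “European” figures towards heavily sampled small countries.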
Figure 11.4 shows participation in a dozen different types of voluntary association in the integrated 22-nation European data, including the proportion who reported that they were members, or participated in the activities, or donated money, or did voluntary work – each of which can be regarded as a progressively more demanding form of civic involvement. The comparisons show that certain groups clearly attracted the most widespread membership, notably sports clubs, in which one-fifth of Europeans reported membership. Nor was this merely passive ‘card-carrying’ membership; sports clubs were also the associations in which many people were highly engaged, whether as team supporters or active participants. Other associations attracted smaller proportions of engagement, although consumer groups, trade unions and cultural/hobby groups were all relatively popular, followed by belonging to a religious organisation, social club or educational group. The groups representing ‘newer’ social movements – notably groups concerned with the environment or peace – proved less popular in membership and activism, although many reported donating funds to these types of organisation. Political parties were overall the least popular type of voluntary association in terms of all types of participation combined. The erosion of party membership in established democracies, documented in the historical records of official membership rolls, appears to have shrunk the mass basis of contemporary political parties in civil society (Mair and van Biezen, 2001; Scarrow, 2001). The more detailed breakdown of membership in different types of voluntary association by country is shown in Table 11.2. It confirms that these patterns certainly vary across European nations, with membership far higher in the smaller European welfare-state societies, notably Sweden, Denmark, Norway, the Netherlands and Austria, where most people belonged to two or more types of organisation. This was sustained in large part by the density of trade union membership in these nations (see Western, 1994; Ebbinghaus and Visser, 1999; Blashke, 2000; Norris, 2002). In contrast, the countries in which associational membership remains weakest include countries in Central Europe (Poland and Hungary) as well as in Mediterranean Europe (Greece and Portugal). This suggests that differences in civil society do not represent a simple dichotomy between Western and post-Communist Europe, or between older and newer democracies, or even between Protestant, Catholic and Orthodox societies. More complex patterns are evident.
Table 11.2  Membership in voluntary associations, by country, ESS-2002

Country       Sports  Consumer/  Trade  Cultural/  Church/    Social  Science/   Prof/    Humanitarian/  Pol    Environment/  Other  Mean
              club    automob    union  hobby      religious  club    education  farmers  human rights   party  peace                membership
Sweden          39      37        56      25         15         19       11         9         14           8        7           11     2.49
Denmark         36      18        64      26         27         18        7        14         11           6       12            7     2.46
Norway          32      33        47      22         15         22        8        15         17           9        5           14     2.40
Netherlands     46      32        22      19         26         10       10        13          8           5       20           12     2.21
Austria         26      33        22      18         32         19       10        10          8          11       13           11     2.11
Luxembourg      25      46        22      22          6         17       11        11          9           6       14            6     1.94
Ireland         34       9        19      18         25         16        8        16          5           4        4            6     1.63
Belgium         29       9        28      22          7         20        8         9          7           6        8            8     1.59
Germany         32      28        14      17         19         13        6         9          6           3        6            7     1.59
Britain         27      32        16      16         14         16        7        13          4           3        6            5     1.57
Finland         22       5        46      13         25          9        5        12          4           6        2            7     1.55
Israel          17      24        14      13          5         11        9         8          3           8        3            7     1.23
France          21       4         9      18          5         10        8         4          6           2        5            7     0.97
Slovenia        16       9        19       8          8          0        5         8          5           4        1            5     0.89
Spain           12       4         6      11          6          7        8         5          4           3        2            3     0.70
Czech Rep       13       2        10      11          5          6        4         4          1           4        2            8     0.68
Italy            8       6         9       7          5          5        2         9          4           3        3            2     0.64
Hungary          5       3         6       5          6          5        3         3          1           2        0            2     0.41
Greece           4       0         5       6          1          3        4         5          1           4        1            2     0.38
Portugal         7       1         3       3          5          3        1         2          2           3        1            4     0.36
Poland           4       0         6       3          3          2        2         1          0           2        1            3     0.28
Total           21      15        14      13         11         10       10         8          5           3        5            5     1.16
US (1994)       22       –        12       9         33          –        –        19          –           –        –            –       –

Note: Mean membership is the average of all memberships in each nation
Source: European Social Survey, 2002. Weighted by dweight and pweight; US General Social Survey, 1994
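The “mean membership” figure in Table 11.2 is a simple count-based scale: each respondent is scored by how many of the 12 association types they belong to, and the national figure is the average of those counts. A minimal sketch with fabricated data (the country labels and membership rates are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Fabricated data: 1000 respondents x 12 association types (True = belongs)
belongs = rng.random((1000, 12)) < 0.15
country = rng.choice(["A", "B"], size=1000)

# Each respondent's count of memberships, then the national mean of the counts
counts = belongs.sum(axis=1)
mean_membership = {c: counts[country == c].mean() for c in ("A", "B")}
print(mean_membership)
```

By construction the scale runs from 0 (member of nothing) to 12 (member of every type), which is why national means of around 2.5 in the Nordic countries imply that most people belong to at least two types of organisation.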
Figure 11.5 compares the 22 ESS countries on a more systematic basis by contrasting both dimensions of social capital in each nation – mean levels of membership of all types of voluntary associations and mean levels of social trust. The comparison confirms that the distribution of social capital varies substantially within Europe. The Nordic nations and smaller welfare states – all affluent, established democracies – emerge in the top-right quadrant of the scattergram as the richest across both dimensions of social capital. And the European countries which proved lower in social capital include the newer democracies and less well-off nations in both post-Communist and Mediterranean Europe. The fact that these data replicate and strongly reflect the pattern found previously using similar comparisons of social trust and associational membership in the same countries in the 1995 World Values Survey lends considerable confidence to the stability and consistency of this relationship and the robustness of the measures employed (Norris, 2002).

Figure 11.5  Key dimensions of European social capital, ESS-2002
Note: The mean level of membership in 12 types of voluntary association and the mean score on the social trust scale by nation. See appendix at the end of this chapter for the construction of these scales
Source: European Social Survey, 2002. Weighted by allwt
Cohort analysis of social capital

But the heart of this study concerns whether there has been an erosion of social capital in Europe similar to that which is thought to have occurred in the United States. By relying on the cross-national European Social Survey, without time-series benchmarks from earlier years, we cannot establish the patterns conclusively. But as noted, we can compare patterns among successive cohorts of birth as a proxy measure indicating age-related patterns of social change. If there are linear patterns evident from older to younger cohorts, with young people less willing to join associations or to trust others, this can be interpreted as indirect evidence of generational shifts in social capital. If, on the other hand, curvilinear patterns are evident by age group, then this suggests (although it cannot prove) a life-cycle effect. Table 11.3 presents the simple mean scores on each of the four scales in the integrated ESS-2002 sample without any prior controls. It estimates the significance of mean differences among age groups by ANOVA.

Table 11.3  Social capital by age group, ESS-2002

                        Generalised Reciprocity              Social Connectedness
                     Social trust  Social tolerance  Membership in         Frequency of meeting socially
Age group            scale (mean)  scale (mean)      voluntary             with friends, relatives or
                                                     associations (mean)   colleagues (mean)
Younger (18–29)         14.5          23.7              0.88                  2.64
Middle-aged (30–59)     14.6          23.1              1.30                  2.20
Older (60+)             14.9          21.1              1.09                  2.12
Total mean              14.6          22.7              1.17                  2.26
Total N                 35745         32447             34799                 36200
Total Std. Dev.         5.66          7.94              1.57                  0.74
ANOVA Eta               .025          .115              .111                  .214
ANOVA Sig.              ***           ***               ***                   ***

Note: The significance of differences between age groups (without any controls) was measured by ANOVA. *** Sig.=.001. For the construction of the four scales see appendix at the end of this chapter.
Source: European Social Survey, 2002. Integrated data for 22 nations weighted by dweight and pweight
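The ANOVA eta reported in Table 11.3 is the square root of the between-group share of the total sum of squares. The sketch below recomputes it on simulated data; the age cut-points follow the table, but the variable name and the data-generating assumptions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical respondents: age and an informal-sociability score on a 1-7 scale
age = rng.integers(18, 90, size=2000)
meet = np.clip(rng.normal(5.0 - 0.01 * age, 1.5), 1, 7)

# Age groups as in Table 11.3
groups = np.select([age < 30, age < 60], ["younger", "middle"], default="older")

# Mean score per age group, as in the rows of the table
means = {g: meet[groups == g].mean() for g in ("younger", "middle", "older")}

# Eta: sqrt(between-group sum of squares / total sum of squares)
grand = meet.mean()
ss_between = sum(((meet[groups == g].mean() - grand) ** 2) * (groups == g).sum()
                 for g in np.unique(groups))
ss_total = ((meet - grand) ** 2).sum()
eta = np.sqrt(ss_between / ss_total)
print(means, round(eta, 3))
```

Eta ranges from 0 (age groups explain nothing) to 1 (scores are fully determined by age group), so the small values in the table indicate statistically significant but substantively modest group differences.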
The results illustrate the basic age differences and show complex patterns across all indicators. The social trust scale confirms that the older age group are most trusting. The modest differences between groups are nonetheless statistically significant. This pattern is consistent with a secular erosion in social trust, as social capital theory predicts. Yet the social tolerance scale also shows a small but significant age gap, and this time it is the older group which proved least tolerant,
not more. Membership of voluntary associations proved most popular among the middle-aged, in a curvilinear pattern. Lastly, when it comes to the informal dimension of social networks – as measured by the frequency of meeting with friends, relatives or colleagues – meetings were significantly more frequent among the youngest age group; informal networks weakened with age. A series of OLS regression models was then run on the pooled data to assess the impact and significance of linear age effects and, alternatively, of a dummy variable for the younger cohort, on the core indicators of social capital (the results are shown in Table 11.4). The full regression models include prior controls (not shown here) for the standard variables that commonly influence patterns of social capital, including gender, education, income and length of residency in an area. The coefficients can therefore be understood to represent the impact of age on key dimensions of social capital net of other standard social and demographic characteristics.

Table 11.4  Multivariate models of the impact of age on indicators of social capital, ESS-2002, with controls

                          Model A: Age effects (in years)        Model B: Age effects (younger cohort)
                          B      s.e.   Beta    Sig.  Adj.R2     B      s.e.   Beta    Sig.  Adj.R2
GENERALISED RECIPROCITY
Social trust scale        .026   .002    .079   ***    .070      1.13   .083    .087   ***    .072
Social tolerance scale   −.028   .003   −.059   ***    .092     −.818   .118   −.044   ***    .090
SOCIAL CONNECTEDNESS
Associational
membership scale          .010   .001    .105   ***    .177      .244   .022    .067   ***    .170
Informal social
ties scale               −.009   .001   −.217   ***    .051     −0.14   .011   −.083   ***    .013

Note: The table includes the results of ordinary least squares regression analysis, where Model A analyses the effects of age (in years) and Model B the effects of the younger cohort (under 30) on the four indicators of social capital. The coefficients represent unstandardised coefficients (B), standard errors (s.e.), standardised betas, the significance of the coefficients, and the total variance explained by the full models (adjusted R2). Details about the construction of the four scales are given in the appendix at the end of this chapter. The full regression models include controls (not shown here) for the standard variables that commonly influence patterns of social capital, including gender, education, income and years resident in an area
Source: European Social Survey, 2002. Pooled data for 22 nations weighted by dweight and pweight
The results in Table 11.4 confirm that age remained a significant predictor of each of these four indicators of social capital, even after applying controls. But an important qualification needs to be made to this observation: when it came to the informal social networks scale and social tolerance, the relationship proved negative. So young people are less likely to become card-carrying members of traditional civic associations and less likely to be trusting, but they are at the same time more likely to socialise through friendships and informal ties and to be socially tolerant.

Figure 11.6  Social trust by cohort of birth, ESS-2002
Note: See appendix at the end of this chapter for the construction of the social trust scale. The figure displays mean trust by cohort and the regression line of the series. Cohorts by year of birth: 2=1920–29, 3=1930–39, 4=1940–49, 5=1950–59, 6=1960–69, 7=1970–79, 8=1980–89
Source: European Social Survey, 2002. Weighted by allwt
But are these patterns consistently found across all different types of society? Visual comparison of the mean distribution by age cohort within each nation, and the regression line summarising the best fit, helps us to examine this question. Figure 11.6 illustrates social trust by cohort of birth, and Figure 11.7 the pattern of cohort effects in social tolerance.

Figure 11.7  Social tolerance by cohort of birth, ESS-2002
Note: See the appendix at the end of this chapter for the construction of the social tolerance scale. The figure displays mean tolerance by cohort and the regression line of the series. Cohorts by year of birth: 2=1920–29, 3=1930–39, 4=1940–49, 5=1950–59, 6=1960–69, 7=1970–79, 8=1980–89
Source: European Social Survey, 2002. Weighted by allwt

Figure 11.8  Frequency of social meetings with friends, relatives or work colleagues, ESS-2002
Note: “Using this card, how often do you meet socially with friends, relatives or work colleagues?” Coded from Never (1) to Everyday (7). The figure shows mean frequency by cohort and the regression line of the series. Cohorts by year of birth: 2=1920–29, 3=1930–39, 4=1940–49, 5=1950–59, 6=1960–69, 7=1970–79, 8=1980–89
Source: European Social Survey, 2002. Weighted by allwt

Figure 11.6 shows (and the regression coefficients confirm) that in two-thirds of the nations under comparison there are no significant differences in mean levels of social trust by age cohort. A significant fall in social trust by age cohort was found in Ireland, the UK, Norway and Sweden, while a rise occurred in Greece and Poland. The pattern of cohort effects in social tolerance, as shown in Figure 11.7, was far clearer: out of the nations under comparison, social tolerance increased significantly among the young in 17 countries. In some nations there was no significant pattern by age, and in only one (Israel) tolerance was
greatest among the older cohorts, which may reflect complex patterns of immigration and multiculturalism in this nation. The frequency of informal social meetings is shown in Figure 11.8, and it too increases among the younger cohorts in every country, with some of the most marked patterns evident in post-Communist societies. Finally, in terms of membership of voluntary associations, Figure 11.9 shows the curvilinear pattern by age in all the older democracies, where membership
Figure 11.9  Membership in voluntary associations by cohort, ESS-2002
Note: See the appendix at the end of this chapter for the construction of the membership in voluntary organisations scale. The figure displays mean associational membership by cohort and the regression line of the series. Cohorts by year of birth: 2=1920–29, 3=1930–39, 4=1940–49, 5=1950–59, 6=1960–69, 7=1970–79, 8=1980–89
Source: European Social Survey, 2002. Weighted by allwt
usually peaks among the middle-aged. The generalisation that young people are consistently lower in social capital than their parents and grandparents is not substantiated by these four indicators. Instead, there are important variations, not only by country, but also by type of indicator.

Conclusions

There are multiple reasons to believe that many of the factors that have led to the erosion of social capital in the United States might generate similar effects in Europe. After all, although public service television retains a stronger share of the market in Europe, the growth of commercial television and the ubiquitous impact of television entertainment have transformed leisure habits in many European countries too (Norris, 2000). Similarly, powerful social trends such as the spread of suburban sprawl are now as much a characteristic of Stockholm, Geneva and Dublin as they are of Atlanta, Boston and Chicago. Traditional sex roles have been transformed in European societies by women’s participation in higher education and the paid work-force, by declining female fertility, and by major changes within the family (Inglehart and Norris, 2003). If all these secular social trends are powerful forces in transforming social trust and associational participation in the United States, they might plausibly be expected to generate similar developments among post-industrial societies across the Atlantic. Yet the key findings about social capital that emerge from this comparison of 22 diverse European nations are fourfold:

1. Substantial cross-national differences are evident among countries in their reservoirs of social capital, as measured by levels of social trust and associational membership. These contrasts are not random; instead, social capital is richest in the older democracies and among the smaller welfare states in the Nordic region and Western Europe. Social capital is weaker among the poorer European states found in Mediterranean and post-Communist Europe. This suggests that contemporary patterns of social capital may well be ‘path-dependent’, strongly reflecting the legacy of historical experiences, the role of the state, and socio-economic development in these countries.

2. Social trust is similar among younger and older cohorts in most European countries, although the English-speaking and some Nordic states have experienced a consistent secular fall by age.

3. At the same time, social tolerance and (especially) informal social networks are generally far stronger among younger than older cohorts (not weaker, contrary to the social capital thesis).
4. Significant age effects are found in established democracies, where associational membership displays a curvilinear pattern by age cohort, suggesting a life-cycle effect: both the young and the elderly are less likely to join than are the middle-aged.

The generalisability of these results needs to be confirmed using a broader range of indicators, including patterns of civic engagement and political activism. Nevertheless, the broad picture of the distribution of social capital that emerges from this analysis is reasonably consistent across the societies under comparison. The most striking conclusion is that if, as Putnam suggests, an erosion of social capital has occurred among the younger generation in the United States, driven by trends such as suburbanisation and the spread of TV, similar developments are not apparent across all countries in Europe. True, in some established European democracies young people are less trusting than older cohorts, and they are less likely to join associations than are their parents. Yet at the same time the young are generally more socially tolerant, and throughout Europe they are also far more connected with friends, family and colleagues through informal ties. Younger Europeans are far from socially isolated; instead they are richer in the bonds of friendship. So the decline in social capital does appear to be another case of ‘American exceptionalism’, due to America’s distinctive culture and institutional structure, rather than a civic plague spreading through similar societies in Europe.

References

Alwin, D.F. and Krosnick, J.A. (1991), ‘Aging, cohorts, and the stability of sociopolitical orientations over the life-span’, American Journal of Sociology, 97 (1), pp.169–195.
Arrow, K.J. (2000), ‘Observations on social capital’ in: P. Dasgupta and I. Serageldin (eds), Social Capital: A Multifaceted Perspective, Washington, DC: The World Bank.
Blashke, S. (2000), ‘Union density and European integration: Diverging convergence’, European Journal of Industrial Relations, 6 (2), pp.217–236.
Borowik, I. (2002), ‘The Roman Catholic Church in the Process of Democratic Transformation: The case of Poland’, Social Compass, 49 (2), pp.239–252.
Bourdieu, P. (1970), Reproduction in Education, Culture and Society, London: Sage.
Coleman, J.S. (1988), ‘Social capital in the creation of human capital’, American Journal of Sociology, 94, pp.95–120.
Coleman, J.S. (1990), Foundations of Social Theory, Cambridge: Belknap.
De Graaf, N.D. (1999), ‘Event history data and making a history out of cross-sectional data – How to answer the question “Why cohorts differ?”’, Quality and Quantity, 33 (3), pp.261–276.
Ebbinghaus, B. and Visser, J. (1999), ‘When institutions matter: Union growth and decline in Western Europe, 1950–1995’, European Sociological Review, 15 (2), pp.135–158.
Inglehart, R. and Norris, P. (2003), Rising Tide, Cambridge: Cambridge University Press.
Johnston, M. and Jowell, R. (2001), 'How robust is British civil society?' in: A. Park, J. Curtice, K. Thomson, L. Jarvis and C. Bromley (eds), British Social Attitudes: the 18th Report, London: Sage.
Ladd, C.E. (1996), 'The Data Just Don't Show Erosion of America's Social Capital', The Public Perspective, 7 (4), pp.1–22.
Lipset, S.M. (1996), American Exceptionalism: A Double Edged Sword, New York: W.W. Norton.
Mair, P. and van Biezen, I. (2001), 'Party membership in twenty European democracies 1980–2000', Party Politics, 7 (1), pp.7–22.
Mason, W.M. and Fienberg, S.E. (1985), Cohort Analysis in Social Research, New York: Springer-Verlag.
Newton, K. (2001), 'Trust, Social Capital, Civic Society, and Democracy', International Political Science Review, 22 (2), pp.201–214.
Newton, K. and Norris, P. (2000), 'Confidence in Public Institutions: Faith, Culture or Performance?' in: S. Pharr and R. Putnam (eds), Disaffected Democracies: What's Troubling the Trilateral Countries?, Princeton, NJ: Princeton University Press.
Norris, P. (2002), Democratic Phoenix: Reinventing Political Activism, Cambridge: Cambridge University Press.
Norris, P. and Inglehart, R. (2004), Sacred and Secular: Religion and Politics Worldwide, Cambridge: Cambridge University Press.
Putnam, R.D. (1993), Making Democracy Work: Civic Traditions in Modern Italy, Princeton, NJ: Princeton University Press.
Putnam, R.D. (1995), 'Tuning In, Tuning Out: The Strange Disappearance of Social Capital in America', PS: Political Science and Politics, XXVIII (4), pp.664–683.
Putnam, R.D. (2000), Bowling Alone: The Collapse and Revival of American Community, New York: Simon and Schuster.
Putnam, R.D. (ed.) (2002), Democracies in Flux, Oxford: Oxford University Press.
Rose, R., Mishler, W. and Haerpfer, C. (1997), 'Social capital in civic and stressful societies', Studies in Comparative International Development, 32 (3), pp.85–111.
Rotolo, T. (1999), 'Trends in voluntary association participation', Nonprofit and Voluntary Sector Quarterly, 28 (2), pp.199–212.
Scarrow, S. (2001), 'Parties without Members?' in: R.J. Dalton and M. Wattenberg (eds), Parties without Partisans, New York: Oxford University Press.
Schudson, M. (1996), 'What if civic life didn't die?', The American Prospect, 25, pp.17–20.
Torpe, L. (2003), 'Social capital in Denmark: A deviant case?', Scandinavian Political Studies, 26 (1), pp.27–48.
Van Deth, J.W. (ed.) (1997), Private Groups and Public Life: Social Participation, Voluntary Associations and Political Involvement in Representative Democracies, London: Routledge.
Western, B. (1994), 'Institutionalised mechanisms for unionisation in 16 OECD countries: An analysis of social survey data', Social Forces, 73 (2), pp.497–519.
Appendix: Measures and scales

SOCIAL TRUST

Careful (coding: 10-pt scale)
"Using this card, generally speaking, would you say that most people can be trusted, or that you can't be too careful in dealing with people? Please tell me on a score of 0 to 10, where 0 means you can't be too careful and 10 means that most people can be trusted."

Fair (coding: 10-pt scale)
"Using this card, do you think that most people would try to take advantage of you if they got the chance, or would they try to be fair?"

Helpful (coding: 10-pt scale)
"Would you say that most of the time people try to be helpful or that they are mostly looking out for themselves? Please use this card."

Social trust scale (coding: 30-pt scale)
Summary scale of Careful + Fair + Helpful.

SOCIAL TOLERANCE

Social tolerance scale (coding: 45-pt scale)
A scale combining the following items (Cronbach's Alpha = .81):
"Using this card, would you say that people who come to live here generally take jobs away from workers in [this country] or generally help to create new jobs?"
"Would you say it is generally bad or good for [this country]'s economy that people come to live here from other countries?"
"And, using this card, would you say that [country]'s cultural life is generally undermined or enriched by people coming to live here from other countries?"
"Is [country] made a worse or a better place to live by people coming to live here from other countries?"
"If a country wants to reduce tensions it should stop immigration."

ASSOCIATIONAL PARTICIPATION

Participation in voluntary organisations (coding: code all that apply within each organisation – none (0), member (1), participated (2), donated money (3), did voluntary work (4))
"For each of the voluntary organisations I will now mention, please use this card to tell me whether any of these things apply to you now or in the last 12 months, and, if so, which? A sports club; an organisation for cultural or hobby activities; a trade union; a business, professional or farmers' organisation; a consumer or automobile organisation; an organisation for humanitarian aid, human rights, minorities or immigrants; an organisation for environmental protection, peace or animal rights; a religious or church organisation; a political party; an organisation for science, education, or teachers and parents; a social club for the young, the retired/elderly, women or friendly societies; any other voluntary organisations?"

INFORMAL SOCIAL NETWORKS

Informal social ties (coding: Never (1) to Every day (7))
"Using this card, how often do you meet socially with friends, relatives or work colleagues?"
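The construction of the social trust summary scale (three 0–10 items summed into a single scale) and the reported Cronbach's alpha can be illustrated with a short sketch. This is not ESS code: the function names and the example respondent data are hypothetical, and the survey itself computes these statistics on the full national datasets.

```python
# Sketch: building a summary scale from the three trust items and
# checking internal consistency with Cronbach's alpha.
# Item order follows the appendix: (careful, fair, helpful), each 0-10.
from statistics import variance  # sample variance (n - 1 denominator)

def summary_scale(items):
    """Sum one respondent's item scores: three 0-10 items -> 0-30 scale."""
    return sum(items)

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score tuples."""
    k = len(rows[0])                              # number of items
    cols = list(zip(*rows))                       # per-item score columns
    item_vars = sum(variance(c) for c in cols)    # sum of item variances
    total_var = variance([sum(r) for r in rows])  # variance of scale totals
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical respondents: (careful, fair, helpful) scores
data = [(7, 8, 6), (3, 4, 2), (9, 9, 8), (5, 6, 5), (2, 3, 3)]
scores = [summary_scale(r) for r in data]  # 0-30 social trust scale values
alpha = cronbach_alpha(data)               # consistency of the three items
```

With strongly correlated items, as here, alpha approaches 1; the appendix reports .81 for the five-item tolerance scale on the real data.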
Index
access to data 24–6, 137–55, 160–1, 164–5 achievement values 174–5, 176, 181, 183, 184, 189, 190, 203 age 188, 203 and social capital 194, 250, 254–60, 261 of survey participants 37, 128, 147 and values 188–9, 190, 197 agenda-setting 97 agricultural sector 224, 226 ALLBUS 158 anonymity 139, 140 ASD (ask different questions) models 78 Asian tsunami 96, 100, 105 ASQ (ask the same questions) models 78–9 associational membership 128, 210–17 passim, 231, 233, 239–54 passim, 265 age-related patterns 254, 255, 256, 260 country-level predictors 227, 228–9 values and 194–5 ‘at-home’ patterns 119, 122–3 Austria 11, 103, 105, 128 contact attempts 120, 121, 122, 123 fieldwork periods 116 political participation 207, 208, 217–23 passim, 228 refusal conversion 124, 125, 130 response, refusal and non-contact rates 117, 120, 123 sampling 37, 38, 39, 40, 43, 44, 45, 46, 47 social participation 208–9, 216, 221, 222, 223, 228, 259 associational membership 211, 212, 216, 252, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 84 values 184, 186, 187, 203 ‘Barometer’ surveys 5, 81 Belgium 11 contact attempts 120, 121, 122, 123 political participation 218, 220, 222, 223 refusal conversion 124, 125 response, refusal and non-contact rates 117, 120, 123
sampling 38, 39, 41, 43, 44, 46, 47, 68 social participation 212, 213, 216, 222, 223, 253, 254, 259, 260 social tolerance 258 social trust 254, 257 translations 79, 84, 85 values 184, 186, 187, 203 Beliefs in Government project 2 benevolence values 174, 175, 176, 181, 182, 184, 185, 203 age and 189, 190 definition 174 education and 191 gender and 191 income and 191 social involvement and 194, 195, 204 social trust and 193, 204 Beslan school siege 105, 106 bias 58 courtesy 6 non-response 110, 112–13, 127–30, 131, 161, 163 self-presentation 179 Britain see United Kingdom (UK) British Social Attitudes Survey (BSA) 42, 158 capacity building 165–66 Central Coordinating Team (CCT) 8, 12, 15, 27 n.4, 53, 146, 161 city dwellers 128, 163, 191, 192, 204 clustering 16, 40, 41 sample design effects due to (DEFFc) 45–7, 48 coding of data 6, 142–4 cognitive interviewing 57 concepts, translation into questions 55 concepts-by-intuition 56–7 concepts-by-postulation 56–7, 61–7 conceptual equivalence 78 confidentiality 141 conformity values 173, 175, 176, 181, 183, 184, 196, 204 age and 188, 190, 192 definition 174 education and 189, 190, 192
gender and 190 income and 190 mean importance scores 203 conservation values 175, 176, 181, 182, 184, 186, 187, 188, 195, 203 consistency 159–60 contact attempts 119–23, 130 contact forms 112, 114–15 contact procedures 120 contactability 121–7 contextual data 25–6, 147, 149–50 continuity 11–12 contract adherence 19–21 copy-editing 88 copyright 139 corruption 224, 226n., 228 costs, data collection 36 cultural barriers to data access 138 cultural context 25–6, 147, 150–1 cultural embeddedness 193, 195, 204 cultural values 172–73, 187–8, 192, 193, 194, 196, 197 Czech Republic 11, 68, 104, 105 political participation 218, 220, 222, 223, 228 response, refusal and non-contact rates 117, 118, 130 sampling 38, 39, 41, 43, 44, 45, 46, 47 social participation 212, 216, 222, 223, 228, 253, 254, 256 social tolerance 258 social trust 257 translations 84 values 184, 186, 188, 203 Data Documentation Initiative (DDI) 147, 153 democracy 220–21, 224, 225, 243, 250 demographic factors 25, 128–9, 147, 163, 191–2, 224, 225 Denmark 11, 68, 105 political participation 218–23 passim response, refusal and non-contact rates 117 sampling 38, 39, 41, 43, 44, 46 social capital 248 social participation 216, 217, 221, 222, 223, 248, 259 associational membership 212, 213, 216, 252, 253, 254, 260 social tolerance 258 social trust 248, 254, 257 translations 84 values 184, 186, 203
dissemination of data 25–6, 146–54 Document Type Definition (DTD) 153 education 25, 147, 191, 192, 195, 204 coding frames 143, 145 government expenditure on 224, 225 and political participation 209, 224 and social capital 193, 194, 195 and social participation 194, 224 and survey participation 128, 163 and values 189–90, 192, 197 EduNet 26, 165–6 egalitarianism 192, 193, 194, 204 elections 98, 99, 103, 104 employment status 191, 193, 204 equivalence 6–9, 138, 140–1 conceptual 78 sampling 9, 32, 33–5, 36, 162 semantic 78 see also standardisation errors measurement 163–4 random and systematic (respondents) 55 survey 157, 161, 162–64 Estonia 11, 68, 69, 104 sampling 38, 39, 41, 43, 44, 46 translations 79, 84, 85 European Commission (EC) 2, 10, 26, 108, 154, 166, 167 European Science Foundation (ESF) 2, 26, 167 Blueprint document 2–3, 8, 79, 94, 95 European Social Survey (ESS) documentation 142, 144–7, 149–50 organisational structure 12–15, 161 participating countries 11 website 25–6, 138, 141–42, 147–54, 160–1, 165 workpackages 15–26 see also individual topics European Union 94, 99, 104 European Values Surveys 5, 81 event(s) impact on social capital 250 monitoring and reporting 24, 93–110 experience and values 171–2 feeling(s) 64, 66, 67, 180 and values 170, 172 fieldwork 11, 115–16 commissioning 18–19 documents 149, 150 monitoring 19–21, 113, 114 periods 11, 20, 113, 115–16
Finland 11, 70, 105 contact attempts 121, 122, 123 political participation 218, 220, 222, 223, 228 refusal conversion 124, 125 response, refusal and non-contact rates 117, 118, 123 sampling 38, 39, 41, 43, 44, 46 social participation 216, 222, 223, 228, 259 associational membership 212, 213, 216, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 79, 84, 85 values 184, 186, 203 France 11, 99, 104 political participation 218, 220, 222, 223 response, refusal and non-contact rates 117 sampling 38, 39, 43, 44, 45, 46, 47 social participation 212, 216, 222, 223, 253, 254, 259, 260 social tolerance 258 social trust 254, 257 translations 84, 85 values 182, 184, 186, 187, 203 Funders’ Forum 13, 28 n.6 funding 3, 10, 11–12, 26–7, 166, 167 GDP 191, 225, 227 gender 25, 147, 189, 190, 191, 204 generalised reciprocity 239, 240, 241, 242, 251, 255, 256 geographical coverage 37 Germany 11, 68, 98, 130 ALLBUS survey 158 contact attempts 120, 121, 122, 123 political participation 129, 217, 218, 220, 222, 223 refusal conversion 125, 128, 130 response, refusal and non-contact rates 117, 118, 120, 121, 128 sampling 38, 39, 42, 43, 44, 46, 47 social participation 212, 213, 216, 221, 223, 253, 254, 259, 260 social tolerance 258 social trust 129, 254, 257 translations 84, 85 values 182, 184, 186, 203 goals and values 170, 172 government effectiveness of 224, 225–6, 227, 243 expenditure 224, 225 Greece 11, 38 contact attempts 119, 121, 122, 123 political participation 218–23 passim, 228
refusal conversion 116, 121, 126 response, refusal and non-contact rates 117, 118, 121 sampling 38, 39, 43, 44, 46, 47, 48 social participation 216, 217, 221, 222, 223, 228, 259 associational membership 211, 212, 216, 252, 253, 254, 260 social tolerance 258 social trust 254, 257, 258 values 184, 187, 188, 204 Groves, R.M. 157–8 health, government expenditure on 224, 225 hedonism values 175, 176, 181, 182n., 183, 184 age and 188–9, 190 definition 174 education and 190 gender and 190 income and 190 mean importance scores 203 social involvement and 194, 204 helping behaviour 214–17, 227, 228, 229, 232, 234 hierarchical linear modelling (HLM) 191 Hofstede, G. 192 Human Development Index (HDI) 191–2, 194, 195, 196, 197, 204 Human Values Scale 26, 64–7, 166, 170, 177–9, 201–2 Hungary 11, 106, 107 contact attempts 121, 122, 123 fieldwork periods 116 political participation 218–23 passim, 228 refusal conversion 124–5 response, refusal and non-contact rates 117, 118, 121 sampling 38, 39, 41, 43, 44, 46, 47 social participation 216, 217, 221, 222, 220, 225, 244, 255 associational membership 212, 213, 216, 252, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 84, 85 values 184, 186, 204 Hutton Inquiry 101 Iceland 11, 38, 39, 41, 43, 44, 46, 84 immigration 105, 128, 129, 130, 131, 149, 190–1 importance assertions 64, 65, 66, 67, 179–80 income 190–1, 191, 193, 194, 197, 204 equality/inequality 224, 225, 227
Inglehart, R. 170–73, 192 institutional barriers to data access 139 intellectual autonomy 196, 204 Inter-university Consortium for Political and Social Research (ICPSR) 108n. International Social Survey Programme (ISSP) 5, 7, 79, 154, 158, 161 interviewer effects 41 interviewing 11, 20, 56, 113, 115 intra-class correlation 41, 45 Iraq 96, 100, 103, 105, 106 Ireland 11, 21 contact attempts 119, 120, 121, 122, 123, 130 political participation 218, 220, 222, 223, 228 refusal conversion 124–5 response, refusal and non-contact rates 117, 118, 120, 121 sampling 37, 38, 39, 41, 43, 44, 45, 46, 47 social participation 216, 222, 223, 228, 259 associational membership 212, 213, 217, 248, 253, 254, 260 social tolerance 258 social trust 254, 257, 258 values 184, 186, 187, 203 Israel 11, 38 contact attempts 121, 122, 123 media-reported events 103, 105, 106 political participation 217, 218, 220, 221 refusal conversion 124, 125 response, refusal and non-contact rates 117, 118, 121 sample design effects 44, 45, 46 social participation 212, 213, 216, 222, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 80, 83, 84 values 184, 186, 187, 188, 203 Italy 11, 38 contact attempts 123, 124, 125 media-reported events 98, 99, 105–6 political participation 217, 218, 220, 222, 223, 228 refusal conversion 124, 125 response, refusal and non-contact rates 117, 118, 121 sample design effects 44, 46 social participation 212, 216, 222, 223, 228, 253, 254, 259, 260 social tolerance 258 social trust 254, 257 translations 84, 85 values 184, 186, 203 item-non-response 56, 57
John Paul II, Pope 100, 105 Kansas Event Data System project (KEDS) 109 Kennedy, John F. 95 Kish, Leslie 34 language(s) 7 English as official 8–9 first 37, 79 multiple and shared 79–80, 83–5 source and target 78–9 see also translation(s) law and order 224, 225, 227 legal barriers to data access 139 Lexis-Nexis 109–10 life course 188–9, 250, 261 life expectancy 192, 193, 204, 224, 225 life satisfaction 47 Luxembourg 11, 68 contact attempts 120, 121, 122, 123 political participation 218, 220, 222, 223 refusal conversion 125 response, refusal and non-contact rates 117, 118, 121, 130 sampling 38, 39, 43, 44, 45, 46 social participation 212, 213, 216, 222, 223, 253, 254, 259, 260 social tolerance 258 social trust 257 translations 80, 83, 84, 85 MADIERA portal 155 measurement errors 163–4 of values 177–82 media-reported events 93–110 Medien Tenor 99 method effect 56, 58, 59 methodology 7–9 Methods Group 14, 28 n.7 monitoring 19–21, 113, 114 event 24, 93–110 multidimensional scaling (MDS) 182, 186 Multitrait Multimethod (MTMM) experiments 21, 23, 53, 57–8, 60–1, 66–70 National Coordinators 14–15, 28–9 n.9, 53, 161 National Technical Summaries 112, 114, 116, 145–6
needs and values 171 NESSTAR 25, 148, 149, 152–5 Netherlands 11, 21 contact attempts 121, 122, 123 media-reported events 96, 98, 99, 103, 104, 105, 106 political participation 129, 207, 209, 218, 220, 222, 223, 228 refusal conversion 124, 125–6, 128, 130 response, refusal and non-contact rates 117, 118, 121, 128, 130 sampling 38, 39, 43, 44, 46 social participation 216, 217, 221, 222, 223, 228, 259 associational membership 212, 213, 216, 252, 253, 254, 260 social tolerance 258 social trust 129, 254, 257 translations 84, 85 values 184, 186, 187, 203 new social movements 248, 249 newspaper reporting 97–103, 107, 108 Nie, N. 206–7 non-contacts 110, 112, 115, 117, 118 reduction of 119–27 non-response 110, 112, 114, 116–18 non-response bias 110, 112–13, 127–30, 131, 161, 163 Norway 11, 68, 105 political participation 207, 218–23 passim, 228 response, refusal and non-contact rates 117 sampling 38, 39, 41, 43, 44, 45, 46 social participation 216, 217, 221, 222, 223, 228, 259 associational membership 211, 212, 213, 216, 248, 252, 253, 254, 260 social tolerance 258 social trust 254, 257, 258 translations 84 values 184, 186, 203 Norwegian Social Science Data Services (NSD) 24, 26, 140, 143, 146 openness to change 175, 176, 181, 182, 183–4, 186, 187, 188–9, 195, 203 organisational membership see associational membership ownership of data 138, 139 participation survey, ESS participating countries 11–12 individual see response rates see also political participation; social participation
Pfetsch, B. 97 piloting 21, 52, 54 PISA report 103–4 Poland 11, 21, 37, 68, 98, 104 contact attempts 120, 121, 122, 123 General Social Survey 158 political participation 218–23 passim, 228 refusal conversion 124, 125 response, refusal and non-contact rates 117, 118, 121 sampling 38, 39, 43, 44, 46, 47 social participation 216, 217, 221, 222, 223, 228, 259 associational membership 211, 212, 216, 248, 252, 253, 254, 260 social tolerance 258 social trust 254, 257, 258 translations 84 values 184, 186, 203 political efficacy 61–3, 195, 204 political participation 128, 129, 130, 131, 195–6, 205–9, 210, 217–32, 235, 236, 237, 243, 249, 250 political stability 224, 225, 227, 228 political trust 99, 129, 130, 131, 243 political values 46–7 population characteristics 25, 128–9, 147, 163, 191–2, 224, 225 coverage 37 definition of 33, 37 Portrait Values Questionnaire 177, 182 Portugal 11, 68, 105 contact attempts 121, 122, 123, 130 political participation 218–23 passim, 228 refusal conversion 125 response, refusal and non-contact rates 117, 121 sampling 38, 39, 41, 43, 44, 45, 46, 47 social participation 216, 217, 221, 222, 223, 228, 259 associational membership 212, 213, 216, 252, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 84 values 184, 186, 187, 203 power values 173, 174, 175, 176, 181, 183, 185, 189, 190, 203 precision of estimated differences 33–4, 35–6, 42 Prestige oil spill 93, 94, 103, 105 principal component analysis (PCA) 207–8, 213, 217–19, 221–2, 230, 233–7 probability sampling 33, 41–2, 157–8, 162 procedural differences 7
protest politics 98, 103, 105–6, 219–21, 222–23, 227, 232, 236 Putnam, Robert 193, 238, 240, 241, 243, 247, 248 quality of data 21–2 see also questions, quality of; response quality Question Module Design Teams 13, 28 n.8, 53 questionnaire(s) annotation of English (source) 54, 86–6 design 16, 52 core topics 16, 23, 53 rotating modules 14, 16, 23 stages of 53–5 online availability 149 pre-tests 56, 80 supplementary 55, 59–60 questions 6 agree/disagree 62, 63 background 79 differences of interpretation 54–5 double-barrelled 64 evaluation of 55–70 Multitrait Multimethod (MTMM) analysis 23, 52, 56–7, 59–60 quality of 21, 53, 56–70 predictions 23, 52, 53, 60 reliability and validity 22–4, 52, 53, 61, 62, 66, 70 response categories 62–3 reference points 68, 69 referendums 96, 101, 106 refusal conversion 114, 124–7, 130 refusals 110, 112, 115, 117–18, 124, 125 hard and soft 126, 127 regression analysis 226 reliability 56, 58, 59 of questions 22–3, 52, 53, 62, 63, 66, 70 reliability coefficient 58 religion 144, 191, 193, 194, 204, 224, 225, 248 representativeness 33 respondents cooperative/reluctant 126, 127–30, 131 numbers 20 variables affecting survey participation 127–9, 131, 163 response categories 62–3 reference points 68, 69 response quality, standards and documentation 113–15 response rates 6, 21–2, 110–34, 163 targets 20, 22, 113–14
response tendencies 180–1, 196–7 responses centring of 180 random errors in 55 systematic errors in 55 rigour 4–6, 9 risk aversion 65, 66 rule of law 224, 225, 226, 227, 228 sample designs 36, 39, 40–1, 49–50, 162 design effects 7, 34, 40, 41, 42, 44–7, 48, 50 design weights 41–2, 43 sample sizes 35–6, 48, 116 minimum effective 34, 41, 42, 48, 162 samples, selection probabilities 33, 41–2, 47 sampling 16–17, 32–51 equivalence 9, 32, 33–5, 36, 162 probability 33, 41–2, 157–8, 162 simple random (SRS) 34, 35 sampling frames 37–9, 163 Sampling Panel 16–17, 29 n.10, 36, 49, 145, 162 Scientific Advisory Board (SAB) 13–14, 27–8 n.5, 53 secularisation 248, 249 security values 174, 175, 176, 181, 183, 184, 188, 189, 190, 192, 193, 203, 204 selection probabilities 33, 41–2, 162 sample design effects due to (DEFFp) 42, 44–5, 47, 48 self-direction values 173, 174, 175, 176, 181, 183, 188, 189, 190, 195, 196, 203, 204 self-enhancement 175, 176, 181, 182, 183, 186, 187, 188, 189, 203 self-transcendence 175, 176, 181, 182, 183, 186, 187, 188, 189, 203 semantic equivalence 80 Slovakia 11, 38, 39, 41, 43, 44, 46, 79, 84, 85, 104 Slovenia 11, 106, 107 contact attempts 119, 120, 121, 122, 123 political participation 218, 220, 222, 223, 228 refusal conversion 124, 125 response, refusal and non-contact rates 117, 118, 123 sampling 38, 39, 41, 43, 44, 46, 47 social participation 216, 222, 223, 228, 259 associational membership 212, 213, 216, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 84 values 184, 186, 203
Smallest Space Analysis 182 sociability, informal 214–15, 216, 217, 229, 238, 241, 242, 243–44, 247 age and 255, 256–7, 261, 262 frequency of 259, 265 social capital 193–5, 239–65 see also associational membership; sociability; social participation social change, and value priorities 188 social connectedness 238, 240, 241, 242, 255, 256 Social and Cultural Planning Office 18, 24 social indicators 9, 165, 224 social integration 128, 129, 130 social (interpersonal) trust 240, 241, 248, 249, 251, 252, 254, 261, 265 age and 255, 256, 257, 258 political trust and 243 societal cooperation and 242 survey participation and 129 United States 238, 243–45, 247 values and 193 social participation 128, 194–5, 205–6, 210–17, 221–29, 231, 232, 234, 237, 238–65 social tolerance 240, 241, 242, 249, 265 age and 255, 256, 258, 259, 261, 262 United States 239 socialisation 249 socio-cultural context 25–6, 147, 150–1 socio-demographic factors 25, 128–9, 147, 162, 191, 192, 224, 225 Soviet Union 248 Spain 11, 37, 68, 69, 103, 105 contact attempts 121, 122, 123, 130 political participation 218, 220, 222, 223, 228 political trust 93–4 refusal conversion 124, 125 response, refusal and non-contact rates 117, 118, 121 sampling 38, 39, 42, 43, 44, 45, 46, 47 social participation 216, 222, 223, 228, 259 associational membership 212, 213, 216, 253, 254, 260 social tolerance 258 social trust 254, 257 translations 79, 84 values 184, 186, 187, 188, 203 Split Ballot MTMM (SB-MTMM) 61 standardisation 140–47 standards, values as 171, 172 stimulation values 175, 176, 181, 183, 184, 203 age and 188, 190 definition 174
education and 189, 190 gender and 190 income and 190 political participation and 196, 204 social participation and 194, 195, 204 strikes 103, 105–6 suburbanisation 240, 247, 260, 262 survey climate 119 Survey on Health, Ageing and Retirement in Europe (SHARE) 81, 161 Survey Quality Program (SQP) 23, 53, 61–4 Sweden 11, 68, 98, 105 political participation 218–23 passim, 228 response, refusal and non-contact rates 117 sampling 38, 39, 41, 43, 44, 46, 48 social participation 216, 217, 221, 222, 223, 228, 259 associational membership 212, 213, 216, 248, 252, 253, 254, 260 social tolerance 258 social trust 254, 257, 258 translations 84, 85 values 184, 186, 203 Switzerland 11, 21, 68, 105, 128 contact attempts 122, 123 political participation 209, 218–23 passim, 228 refusal conversion 124, 125, 128 response, refusal and non-contact rates 117, 118, 130 sampling 38, 39, 43, 44, 46 social participation 212, 216, 221, 222, 223, 228, 259 social tolerance 258 social trust 257 translations 80, 83, 84, 85 values 184, 186, 187, 188, 203 systematic method effect 59 teaching and training, ESS as source for 26, 165–6 technological change 240, 247 telephone contacts 120 television 97, 98, 239, 240, 247, 261, 262 terrorist attacks 94, 96, 103 'think aloud' protocols 56 tolerance see social tolerance tradition values 173, 174, 175, 176, 181, 183, 184 age and 188, 190 education and 189, 190 gender and 190 income and 190 mean importance scores 203
Translation Expert Panel 17, 29 n.11, 54, 79, 80 translation(s) 8–9, 17–18, 54, 77–92, 164 adjudicators 81 advance 87 back 82, 164 definition 174 documentation templates 86–7, 88 errors 89 multiple 79–80, 83–4 organisation and specification 79–81 query hotline and FAQs 86 review/reviewers 81, 88 split and parallel 82–3 TRAPD procedure 81–2, 86 translators 81 transparency 160–1 Transparency International 224 trust political 99, 129, 130, 131, 243 social see social (interpersonal) trust Turkey 11, 38, 39, 84, 104 Ukraine 11, 103, 104 sampling 38, 39, 40, 41, 43, 44, 45, 46, 47 translations 79, 84, 85 United Kingdom (UK) 11, 21, 37, 68, 98, 99, 128 contact attempts 121, 122, 123, 130 political participation 207, 208, 218, 220, 222, 223, 228, 230 refusal conversion 124, 125, 128 response, refusal and non-contact rates 117, 123 sampling 38, 39, 41, 43, 44, 46, 47 social participation 216, 222, 223, 228, 259 associational membership 212, 213, 216, 253, 254, 260 social tolerance 258 social trust 254, 257, 258 values 184, 186, 187, 203 United States 104, 207, 209, 238, 243–47, 262 General Social Survey 158, 244, 245, 246 universalism values 175, 176, 181, 183, 184, 192, 204 age and 190, 191, 192 and associational membership 194–5, 204
definition 174 education and 189–90, 192 gender and 190 income and 190 mean importance scores 203 and political participation 196, 204 and social trust 193, 204 urbanisation 25, 148, 224, 225 validity 56, 58, 59 of questions 22–3, 52, 53, 62, 63, 66, 70 validity coefficient 58 value priorities 169, 171–73, 180, 183–4, 185–8, 196–7 value relations, structure of 174–6, 182–85, 190–1, 196 value(s) 64–7, 169–204 basic 172, 173–4 comprehensiveness of 176–7 reliability of 181–2 sources of individual differences 188–91 circular motivational structure 175–6, 182–83, 184, 190–1 cultural 172–73, 187–8, 192, 193, 194, 196, 197 higher-order 181, 186–8 measurement of 177–82 nature of 170–2 self-reported 177 Van Gogh, Theo 96, 103, 104, 106 Verba, S. 206–7, 209 voluntary associations see associational membership voting turnout 209, 219, 221, 222–23, 249 wealth, national 224, 225, 226, 228 website, ESS Archive 141–42, 165 publicly accessible 25–6, 138, 148–55, 160–1, 165 welfare state 215, 217, 228 women 240, 247, 261 World Health Organisation (WHO) 81, 161 World Values Surveys 5, 79, 252 Zentralarchiv für Empirische Sozialforschung (ZA) 154